Category: Information access

JSTOR, the electronic archive of academic journal articles, has been in the news this week. A programmer charged with massive theft turns out to be a 24-year-old Harvard researcher named Aaron Swartz, who used a script to download 4.8 million articles from JSTOR to hard disk. His identity was known, and JSTOR involved the police:

Swartz was charged with computer intrusion, fraud, and data theft. If convicted, he faces a maximum of 35 years in prison, restitution and forfeiture, and a fine of $1 million. A PDF of the indictment is here. …

Members of Demand Progress, a nonprofit political action group Swartz founded, criticized the indictment.

“This makes no sense,” the group’s executive director, David Segal, said in a statement. “It’s like trying to put someone in jail for allegedly checking too many books out of the library.”

A critic of academic publishers has uploaded 19,000 scientific papers to the internet to protest the prosecution of a prominent programmer and activist accused of hacking into a college computer system and downloading almost 5 million scholarly documents from an archive service.

The 18,592 documents made available Wednesday through Bittorrent were pulled from the Philosophical Transactions of the Royal Society, a prestigious scientific journal that was founded in the 1600s, the protester said. Even though the vast majority of the documents are hundreds of years old, the London-based Royal Society charges from $8 to $19 for each one, and restricts viewing to one person on one computer for only a single month.

“If I can remove even one dollar of ill-gained income from a poisonous industry which acts to suppress scientific and historic understanding, then whatever personal cost I suffer will be justified – it will be one less dollar spent in the war against knowledge,” Gregory Maxwell, self-described hobbyist scientist from Northern Virginia, wrote in a manifesto accompanying the upload. “One less dollar spent lobbying for laws that make downloading too many scientific papers a crime.”

…

Academics and copyright critics immediately criticized the charges as excessive, likening them to trying to put someone in jail for checking out too many library books. They argue that many of the documents in JSTOR’s collection are probably kept behind its paywall against the authors’ will and that there are no valid copyright claims restricting their distribution.

Indeed, court documents charging Swartz contain no claims of copyright violations. Instead, they cite Swartz for intrusion of MIT’s computer network and for impairing JSTOR’s systems by using an automated script that systematically scraped its archive.

In an email to The Reg, Maxwell said he decided against uploading the documents anonymously to prevent anyone from falsely claiming Swartz was behind the move. All of the documents were published prior to 1923 to ensure they are all in the public domain.

The case is an extremely interesting one from many points of view. The charges are frivolous, since the details of how he accessed the data are, frankly, not the point at issue. These, clearly, are the best charges that the lawyers could find.

It is interesting — and probably telling — that JSTOR don’t want to put their claim of copyright to the court. I suspect their lawyers have advised them that there is nothing to gain, that at present almost everyone is respecting their exaggerated but untested claims, and that the only possible consequence of a judge looking over the matter will be to create case law which — since they currently get everything they want — would most likely restrict them in some way.

Maxwell has done precisely the right thing here, in my opinion, and I hope others will follow him. Let us all, by all means, protest legally in this way. The Royal Society’s greed — futile greed, because whoever would pay such a sum? — is indeed utterly poisonous. Nor is the Royal Society alone. A lot of British tax-funded institutions treat the web as a mechanism to extort money, rather than a means to contribute to society.

At the same time, we need to recognise that JSTOR do have a problem here. They are not altogether the bad guys. The problem, succinctly, is bad law. JSTOR are uploading material created, in the main, by scholars paid by the taxpayer. But JSTOR can’t pay its bills unless it charges. It can’t charge unless it restricts access to institutions. One infuriating aspect: while charging you and me to use it (we have, of course, already paid for it once in taxes), it gives free access to the inhabitants of third-world despotisms.

The answer, surely, is for the government to take over JSTOR and fund it from taxes. It makes no sense for us to pay scholars to create material, with all the facilities involved, and then pay again to access it via a different mechanism, which restricts access to a few. Treat it as what it is — a library funded by the public — and remove all the layers of public money going here and there. It will undoubtedly be cheaper, involve less administration, and benefit the world.

Some might say that academic publishers only allow material on JSTOR because it is subscription, and they get a cut of the cash. This is probably true. But this in turn points up how academic publishing is no longer the benefactor of the world that it was in the days of print. When the only technology for articles was paper journals, these presses performed a service. But now? Technology has rendered that distribution mechanism obsolete, and the funding structure that supported it, harmlessly, is now a barrier to access. This too, I think, will change.

The outcome of the case must be of great interest to all of us. I do hope that the issues are confronted squarely.

UPDATE: There is a thoughtful article at the New Yorker here. This adds the important detail that JSTOR says that, after calling the cops, it “considered its dealings with Swartz complete” once Swartz had deleted his copies of the download.

Just press the large orange banner and then type in the Greek name of any Church Father (or the name of Latin Fathers in regular fonts). It’s amazing what books are available there. I am not sure what is or isn’t available on archive.org but there’s tons of stuff here.

Hmm. I think you have to enter Greek text, but this sounds *very* interesting!

UPDATE: Stephan adds:

There are also handwritten copies of obscure manuscripts I didn’t know existed especially at the library of Zagoras. That library was part of a center of learning started by a rich Greek merchant named John Priggos who sent a thousand books from Amsterdam c 1762. The library has an interesting history

World temperatures did not rise from 1998 to 2008, while manmade emissions of carbon dioxide from burning fossil fuel grew by nearly a third, various data show.

The researchers from Boston and Harvard Universities and Finland’s University of Turku said pollution, and specifically sulphur emissions, from coal-fueled growth in Asia was responsible for the cooling effect.

Is this right? That in the last ten years there was no global warming?

Yet here in the UK we have had night after night of “news” reports, presented as if they were real news, telling us in alarming terms that the world was doomed, and showing pictures of melting ice-floes (in summer!). It subsided quite a bit after the scandal of forged data at the University of East Anglia. The guilty men were found innocent by their peers (funny that), but the mud stuck. There was no getting around the fact that they concealed the data, and that it took a hacker to reveal that they did so intentionally and in words capable of the worst interpretation. But the idea of warming still lingers.

Now I don’t have a view on the technical issues. And doubtless readers of this blog have various views on the political platforms that depend on pro- and anti-global warming stuff. This is not a blog about climate change or global warming, and I don’t propose to address that.

What concerns me is the information access issue. The real issue for me here, if the report is true, is the honesty issue, the poisoning of the public with a lie whose consequences — lightbulbs, ‘green’ taxes — affect everyone directly. Whatever our opinions, we all need accurate data, honestly reported.

But if this report is true — and I have no means of knowing — then we have all been subjected to a deliberate campaign of lies and evasions that would make Goebbels gasp with admiration.

For how could people NOT know that the world was not getting warmer? I wouldn’t know; but there are people whose job it is to know. The money exacted from me in taxes goes to pay their salaries.

This is deeply troubling on so many levels. We rely on a more or less free system of mass communication. To watch it be corrupted in this way raises the obvious question: what else are we not being told? What else is being distorted?

If the answer is “a lot”, then what do we do? We don’t want to become the sort of lunatic obsessed with conspiracies.

Perhaps the answer is to read widely. Watch Russia Today. Watch al-Jazeera. And so on?


Mike Aquilina draws my attention to a new arrival on Gutenberg, the old SPCK translation of letters and treatises by Dionysius of Alexandria. It’s here, and done rather splendidly! I didn’t even know the book existed, or I should long since have scanned it.

A curious report here from the BBC. Apparently a Coptic businessman has reposted a cartoon of Mickey and Minnie Mouse in Moslem dress. I found the Minnie Mouse one online, which I attach; I couldn’t locate the other. Apparently a Moslem cartoonist has, rightly, retaliated with a cartoon of said businessman, which again I have not seen. And extremist Moslem leaders are calling for his head for being disrespectful. Nothing special there.

But much more important is how the BBC reports the situation in Egypt.

The outcry comes at a time of tension between Egypt’s Christians and Muslims. …

But many have questioned his wisdom in sharing the cartoons at a time of tensions between Coptic Christians and conservative Muslims.

Scores of people have been wounded and several killed in clashes between the two communities in recent months, and there are fears this row will increase the chances of more sectarian clashes in the run up to post-revolution elections in September.

What is actually happening is an onslaught on the Coptic community by Moslem groups, now that Mubarak is out of power, as can be seen in many online news reports. But the phrasing plays that down, and carefully creates a false equivalence.

The BBC also uses the term “conservative” — the major British right-of-centre party — to describe the extremists. I’m sure the news team laughed as they did that.

It seems that the BL will allow Google to place 250,000 books published between 1700 and 1870 online. See AFP article here.

All the works will be available for text search download and reading on the British Library’s website www.bl.uk and at Google Books on books.google.co.uk.

The cost of digitising all 40 million pages will be borne by Google, which has entered similar partnerships with Stanford and Harvard universities in the United States as well as in the Netherlands, Italy and Austria.

BBC article is here, BL press release here (and do read a few other announcements first, and laugh to see how many merely repeat the press release).

I was reading John of Damascus in NPNF Series 2, and a comparison was made to Theodoret’s “Epitome of Divine Dogmas.” I tried searching with Google but gave up. Do you know of an available English translation?

The reference is to the prologue here, “From the Latin of the Edition of Michael Lequien, as Given in Migne’s Patrology”. The NPNF says:

After the rules of Christian dialectic and the review of the errors of ancient heresies comes at last the book “Concerning the Orthodox Faith.” In this book, John of Damascus retains the same order as was adopted by Theodoret in his “Epitome of Divine Dogmas,” but takes a different method.

Looking in Quasten’s Patrology reveals no such work by Theodoret; in Migne, vols. 80-84, nothing either. Le Quien’s preface to John Damascene is PG94, columns 66-97. But I could find no such sentence in it.

But my correspondent was luckier, and found a reference in a Word document at the Documenta Catholica Omnia site accessible from here, in the Life and Writings, which takes a while to download. It contains the following:

(ii.) The Haereticarum Fabularum Compendium, was composed at the request of Sporacius, one of the representatives of Marcian at Chalcedon, and is, as its title indicates, an account of past or present heresies. It is divided into five Books, which treat of the following heretics.

IV. Arius, Eudoxius, Eunomius, Aetius, the Psathyriani, the Macedoniani, the Donatists, the Meletians, Apollinarius, the Audiani, the Messaliani, Nestorius, Eutyches. V. The last book is an “Epitome of the Divine Decrees.”

So it is, in fact, part of the Compendium of the Fables of the Heretics. My correspondent added:

…there was enough to cross-reference at Sources Chrétiennes where I found this:

The work ought to exist in English but I don’t believe it does. Nor does it appear in the French Sources Chretiennes series. The work is apparently derivative of earlier works, which probably explains the neglect by scholars.

They say a leopard cannot change its spots, and too often, this is so. Over the last few years I have documented various outrageous examples of greed and cynicism by the British Library.

The BL is, remember, a body entirely funded by the money of others. That money is not given freely. It is exacted by the state under threat of imprisonment from people who (in the main) cannot use the library or its facilities.

Now there is an argument for a national library, as a centre of learning, funded by taxes in order to benefit everyone. In the age of the internet, it could and should act like Google Books, placing books online as PDFs to disseminate knowledge.

But that is not what the BL does. Instead, those who control it keep trying to use the internet to get money, rather than to serve the nation. Unlike the internet model, where everyone gives away content, they keep trying to exploit it commercially.

The books have been scanned in high resolution and color so you can see the engraved illustrations, the beauty of the embossed covers, along with maps and even the texture of the paper the books were printed on.

You can search the collection, browse titles by subject, and even read commentary on some of the titles. The books can be downloaded for reading offline. …

The app only works in portrait mode, but some of the illustrations are oriented in landscape view. …

Yup. It’s not a set of books. It’s an “app”. In other words, the books are locked inside some proprietary software. As soon as I saw that bit, I knew. I could smell it. And sure enough…

Although the app is free, the British Library plans to charge for an enhanced version of 60,000 titles later this year.

You bet they do.

Let us thank heavens for Google Books. Thank heavens for Archive.org.

And a raspberry to these loathsome little civil servants, selling what is not theirs to sell, in an age when even ordinary chaps like me give content away.

Additional MS 14771 – 10th c. Gregory Nazianzen!!! — a bunch of his orations (1, 45, 44, 41, 21, 15, 38, 43, 39, 40, 11, 14, 42, 16), including the funeral oration for Basil the Great. The ms. starts with a table of contents in red uncial. I was once told such tables of contents were rare! This manuscript once belonged to Niccolo Niccoli in Florence, then to the monastery of St. Mark, where Niccoli’s books went after his death. Evidently someone stole it and sold it on.

I was a bit afraid after the opening section that it would all be gospel mss.! But thankfully not — there are some gems in there. But what does smack you in the face is the need for a course in Greek paleography in order to make much of them.

Do add that blog to your RSS feeder. They don’t post that often, but all the posts are interesting and useful, and usually illustrated with some precious page image.

I have just placed an order for a photocopy of the catalogue of the manuscripts of the Vlatadon monastery in Thessalonika. This is the place which had the unknown Galen manuscript, which recovered such treasures for us. I’ve ordered it from the French National Library using their online (and unduly complex) form. Here’s their catalogue entry:

It was interesting using their form, because it showed you what it cost. I first asked for digital images, sent by email. They wanted 6,000 euros for that (!). I then asked for photocopies and that was merely 34 euros. So that’s what I ordered.