In an extraordinary story, Jorge Luis Borges writes of a “Total Library”, organized into ‘hexagons’ that supposedly contained all books:

When it was proclaimed that the Library contained all books, the first impression was one of extravagant happiness. All men felt themselves to be the masters of an intact and secret treasure. . . . At that time a great deal was said about the Vindications: books of apology and prophecy which . . . [contained] prodigious arcana for [the] future. Thousands of the greedy abandoned their sweet native hexagons and rushed up the stairways, urged on by the vain intention of finding their Vindication. These pilgrims disputed in the narrow corridors . . . strangled each other on the divine stairways . . . . Others went mad. . . . The Vindications exist . . . but the searchers did not remember that the possibility of a man’s finding his Vindication, or some treacherous variation thereof, can be computed as zero. As was natural, this inordinate hope was followed by an excessive depression. The certitude that some shelf in some hexagon held precious books and that these precious books were inaccessible, seemed almost intolerable.

About three years ago I spent almost an entire sleepless month coding OpenJudis – my rather cool, “first-of-its-kind” free online database of Indian Supreme Court cases. The database hosts the full texts of about 25,000 cases decided since 1950. In this post I embark on a somewhat personal reflection on the process of creating OpenJudis – what I learnt about access to law (in India), and about “legal informatics,” along with some meditations on future pathways.

Having, by now, attended my share of FLOSS events, I know it is the invariable tendency of anyone who’s written two lines of free code to consider themselves qualified to pronounce on lofty themes – the nature of freedom and liberty, the commodity, scarcity, etc. With OpenJudis, likewise, I feel like I’ve acquired the necessary license to inflict my theory of the world on hapless readers – such as those at VoxPopuLII!

I begin this post by describing the circumstances under which I began coding OpenJudis. This is followed by some of my reflections on how “legal informatics” relates to and could relate to law.

Online Access to Law in India
India is privileged to have quite a robust ICT architecture. Internet access is relatively inexpensive, and the ubiquity of “cyber cafes” has resulted in extensive Internet penetration, even in the absence of individual subscriptions.

The NIC has also been commissioned by the judiciary to develop websites for courts at various levels and publish decisions online. As a result, beginning in around the year 2000, the Supreme Court and various high courts have been publishing their decisions on their websites. The full texts of all Supreme Court decisions rendered since 1950 have been made available, which is an invaluable free resource for the public. Most High Court websites however, have not yet made archival material available online, so at present, access remains limited to decisions from the year 2000 onwards. More recently the NIC has begun setting up websites for subordinate courts, although this process is still at a very embryonic stage.

Apart from free government websites, a handful of commercial enterprises have been providing online access to legal materials. Among them, two deserve special mention. SCCOnline – a product of one of the leading law report publishers in India – provides access to the full texts of decisions of the Indian Supreme Court. The CD version of SCCOnline sells for about INR 70,000 (about US$1,500), which is around the same price the company charges for a full set of print volumes of its reporter. For an additional charge, the company offers updates to the database. The other major commercial venture in the field is Manupatra, which offers access to the full text of decisions of various courts and tribunals as well as the texts of legislation. Access is provided for a basic charge of about US$100, plus a charge of about US$1 per document downloaded. While seemingly modest by international standards, these charges are unaffordable by large sections of the legal profession and the lay public.

OpenJudis
In December 2006, I began coding OpenJudis. My reasons were purely selfish. While the full texts of the decisions of the Supreme Court were already available online for free, the search engine on the government website was unreliable and inadequate for (my) advanced research needs. The formatting of the text of cases themselves was untidy, and it was cumbersome to extract passages from them. Frequently, the website appeared overloaded with users, and alternate free sources were unavailable. I couldn’t afford any of the commercial databases. My own private dissatisfaction with the quality of service, coupled with (in retrospect) my completely naive optimism, led me to attempt OpenJudis. A third crucial factor on the input side was time, and a “room of my own,” which I could afford only because of a generous fellowship I had from the Open Society Institute.

I began rashly, by serially downloading the full texts of the 25,000 decisions on the Supreme Court website. Once that was done (it took about a week), I really had no notion of how to proceed. I remember being quite exhilarated by the sheer fact of being in possession of twenty five thousand Supreme Court decisions. I don’t think I can articulate the feeling very well. (I have some hope, however, that readers of this blog and my fellow LII-ers will intuitively understand this feeling.) Here I was, an average Joe poking around on the Internet, and just-like-that I now had an archive of 25,000 key documents of our republic, cumulatively representing the articulations of some of the finest (and some not-so-fine) legal minds of the previous half-century, sitting on my laptop. And I could do anything with them.

The word “archive,” incidentally, as Derrida informs us, derives from the Greek arkheion, the residence of the superior magistrates, the archons – those who commanded. The archons both “held and signified political power,” and were considered to possess the right to both “make and represent the law.” “Entrusted to such archons, these documents in effect speak the law: they recall the law and call on or impose the law”. Surely, or I am much mistaken, a very significant transformation has occurred when ordinary citizens become capable of housing archives – when citizens can assume the role of archons at will.

Giddy with power, I had an immediate impulse to find a way to transmit this feeling, to make it portable, to dissipate it – an impulse that will forever mystify economists wedded to “rational” incentive-based models of human behavior. I wasn’t a computer engineer, I didn’t have the foggiest idea how I’d go about it, but I was somehow going to host my own online free database of Indian Supreme Court cases. The audacity of this optimism bears out one of Yochai Benkler‘s insights about the changes wrought by the new “networked information economy” we inhabit. According to Benkler,

The belief that it is possible to make something valuable happen in the world, and the practice of actually acting on that belief, represent a qualitative improvement in the condition of individual freedom [because of NIE]. They mark the emergence of new practices of self-directed agency as a lived experience, going beyond mere formal permissibility and theoretical possibility.

Without my intending it, the archive itself suggested my next task. I had to clean up the text and extract metadata. This process occupied me for the longest time during the development of OpenJudis. I was very new to programming and had only just discovered the joys of Regular Expressions. More than my inexperience with programming techniques, however, it was the utter heterogeneity of reporting styles that took me a while to accustom myself to. Both opinion-writing and reporting styles had changed dramatically in the course of the fifty years my database covered, and this made it difficult to find patterns when extracting, say, the names of judges involved. Eventually, I had cleaned up the texts of the decisions and extracted an impressive (I thought) set of metadata, including the names of parties, the names of the judges, and the date the case was decided. To compensate for the absence of headnotes, I extracted names of statutes cited in the cases as a rough indicator of what their case might relate to. I did all this programming in PHP with the data housed in a MySQL database.

And then I encountered my first major roadblock that threatened to jeopardize the whole operation: I ran my first full-text Boolean search on the MySQL database and the results took a staggering 20 minutes to display. I was devastated! More elaborate searches took longer. Clearly, this was not a model I could host online. Or do anything useful with. Nobody in their right mind would want to wait 20 minutes for the results of their search. I had to look for a quicker database, or, as I eventually discovered, a super fast, lightweight indexing search engine. After a number of failed attempts with numerous free search engine software programs, none of which offered either the desired speed or the search capability I wanted, I was getting quite desperate. Fortunately, I discovered Swish-e, a lightweight, Perl-based Boolean search engine which was extremely fast and, most importantly, free – exactly what I needed. The final stage of creating the interface, uploading the database, and activating the search engine happened very quickly, and sometime in the early hours of December 22nd, 2006, OpenJudis went live. I sent announcement emails out to several e-groups and waited for the millions to show up at my doorstep.

They never did. After a week, I had maybe a hundred users. In a month, a few hundred. I received some very complimentary emails, which was nice, but it didn’t compensate for the failure of “millions” to show up. Over the next year, I added some improvements:
1) First, I built an automatic update feature that would periodically check the Supreme Court website for new cases and update the database on its own.
2) In October 2007, I coded a standalone MS Windows application of the database that could be installed on any system running Windows XP. This made sense in a country where PC penetration is higher than Internet penetration. The Windows application became quite popular and I received numerous requests for CDs from different corners of the country.
3) Around the same time, I also coded a similar application for decisions of the Central Information Commission – the apex statutory tribunal for adjudicating disputes under the Right to Information Act.
4) In February 2008, both applications were included in the DVD of Digit Magazine – a popular IT magazine in India.

Unfortunately, in August 2008, the Supreme Court website changed its design so that decisions could no longer be downloaded serially in the manner I had been accustomed to. One can only speculate about what prompted this change – since no improvements were made to the actual presentation of the cases. The only thing that changed was that one could no longer download cases serially as I’d been doing. The new format was far more difficult for me to “hack” and I abandoned the attempt. My work left me with no time to attempt to circumvent the new format.

Fortunately at the same time, an exciting new project called IndianKanoon was started by Sushant Sinha, an Indian computer science graduate at Michigan. In addition to decisions of the Supreme Court, his site covers several high courts and links up to the text of legislation of various kinds. Although I have not abandoned plans to develop OpenJudis, the presence of IndianKanoon has allowed me to step back entirely from this domain – secure in the knowledge that it is being taken forward by abler hands than mine.

Predictions, Observations, Conclusions
I’d like to end this already-too-long post with some reflections, randomly ordered, about legal information online.
1) I think one crucial area commonly neglected by most LIIs is client-side software that enables users to store local copies of entire databases. The urgency of this need is highlighted in the following hypothetical about digital libraries by Siva Vaidhyanathan (from The Anarchist in the Library):

So imagine this: An electronic journal is streamed into a library. A library never has it on its shelf, never owns a paper copy, can’t archive it for posterity. Its patrons can access the material and maybe print it, maybe not. But if the subscription runs out, if the library loses funding and has to cancel that subscription, or if the company itself goes out of business, all the material is gone. The library has no trace of what it bought: no record, no archive. It’s lost entirely.

It may be true that the Internet will be around for some time, but it might be worthwhile for LIIs to stop emulating the commercial database models of restricting control while enabling access. Only then can we begin to take seriously the task of empowering users into archons.

2) My second observation pertains to interface and usability. I have for long been planning to incorporate a set of features including tagging, highlighting, annotating, and bookmarking that I myself would most like to use. Additionally, I have been musing about using Web 2.0 to enable user-participation in maintenance and value-add operations – allowing users to proofread the text of judgments and to compose headnotes. At its most ambitious, in these “visions” of mine, OpenJudis looks like a combination of LII + social networking + Wikipedia.

Far from ensuring fixity or authority, this early history of Printing was marked by uncertainty, and the constant refrain for a long time was that you could not rely on the book; a French scholar Adrien Baillet warned in 1685 that “the multitude of books which grows every day” would cast Europe into “a state as barbarous as that of the centuries that followed the fall of the Roman Empire.”

Europe’s non-descent into barbarism offers us a degree of comfort in dealing with Adrien Baillet-type arguments made in the context of legal information. The stability that we ascribe to law reports today is a relatively recent historical innovation that began in the mid-19th century. “Modern” law has longer roots than that.

3) While OpenJudis may look like quite a mammoth endeavor for one person, I was at all times intensely aware that this was by no means a solitary undertaking, and that I was “standing on the shoulders of giants.” They included the nameless thousands at the NIC who continue to design websites, scan and upload cases on the court websites – a Sisyphian task – and the thousands whose labor collectively produced the free software I used : Fedora Core 4, PHP, MySQL, Swish-E. And lastly, the nameless millions who toil to make the physical infrastructure of the Internet itself possible. Like the ground beneath our feet, we take it for granted, even as the tragic recent events in Haiti in recent weeks remind us to be more attentive. (For a truly Herculean endeavor, however, see Sushant Sinha’s IndianKanoon website, about which many ballads may be composed in the decades to come.)

It might be worthwhile for the custodians of LIIs to enable users to become derivative producers themselves, to engage in “practices of self-directed agency” as Benkler suggests. Without sounding immodest, I think the real story of OpenJudis is how the Internet makes it plausible and thinkable for average Joes like me (and better-than-average people like Sushant Sinha) to think of waging unilateral wars against publishing empires.

4) So, what is the impact that all this ubiquitous, instant, free electronic access to legal information is likely to have on the world of law? In a series of lectures titled “Archive Fever,” the philosopher Derrida posed a similar question in a somewhat different context: What would the discipline of psychoanalysis have looked like, he asked, if Sigmund Freud and his contemporaries had had access to computers, televisions, and email? In brief, his answer was that the discipline of psychoanalysis itself would not have been the same – it would have been transformed “from the bottom up” and its very events would have been altered. This is because, in Derrida’s view:

The archive . . . in general is not only the place for stocking and for conserving an archivable content of the past. . . . No, the technical structure of the archiving archive also determines the structure of the archivable content even in its coming into existence and in its relationship to the future. The archivization produces as much as it records the event.

The implication, following Derrida, is that in the past, law would not have been what it currently is if electronic archives had been possible. And the obverse is true as well: in the future, because of the Internet, “rule of law” will no longer observe the logic of the stable trajectories suggested by its classical “analog” commentators. New trajectories will have to be charted.

5) In the same book, Derrida describes a condition he calls “Archive fever”:

It is to burn with a passion. It is never to rest, interminably, from searching for the archive right where it slips away. It is to run after the archive even if there’s too much of it. It is to have a compulsive, repetitive and nostalgic desire for the archive, an irrepressible desire to return to the origin, a homesickness, a nostalgia for the return to the most archaic place of absolute commencement.

I don’t know about other readers of VoxPopulII (if indeed you’ve managed to continue reading this far!), but for the longest time during and after OpenJudis, I suffered distinctively from this malady. I downloaded indiscriminately whole sets of data that still sit unused on my computer, not having made it into OpenJudis. For those in a similar predicament, I offer Borges’s quote with which I began this text, as a reminder of the foolishness of the notion of “Total Libraries.”