Features – Update to Evaluating Foreign and International Legal Databases on the Internet

Mirela Roznovschi is the Reference Librarian for International and Foreign Law at New York University School of Law Library. She holds an M.A. from the University of Bucharest (Romania), an M.L.S. from Pratt Institute, and a Certificate in Internet Technologies from New York University. Her activities include monitoring and evaluating foreign and international legal databases on the Internet, training law faculty and students to use the Internet for legal research, advising developing democracies on the building of electronic law libraries, and training librarians from developing democracies. She is in charge of the library’s home page, Guide to International and Foreign Law Databases. She also serves as a member of the Index to Foreign Legal Periodicals Advisory Committee.

Initially published by LLRX on February 1, 1999, this article required updating due to ongoing developments in the field of legal databases. Not only are such databases appearing at a dramatic rate, but the factors of evaluation are considerably more numerous and more complex.

Overjoyed at first by this new method of obtaining legal documents, users were soon forced to ask questions about the reliability and accuracy of these resources. Analyzing electronic sources of legal information closely, specialists applied standards that were in some respects the same as, and in others different from, those used for print data. The main concern was to achieve the highest quality of access and information throughout the WWW virtual legal library.1

In his “Introduction to Reference Work”, William A. Katz3 discussed the evaluation process for print reference sources, taking into consideration criteria such as purpose, authority, scope, currency, audience, cost, format, objectivity, annotated indices, and guides to reference. In a comment on “Evaluation of Databases”4 found in the same book, he emphasized that “effective evaluation of databases depends on methods similar to those employed for evaluating printed works, such as purpose, authority and scope”. In addition, he made reference to various formats for storage of database data, how data is accessed, how often the database is updated, “what hardware and software is necessary to make use of the database to its fullest”, and to the speed and efficiency of search. Because of many unexpected factors, these criteria unfortunately proved insufficient, and sometimes actually impossible, to apply to legal databases on the Internet.

The Internet carries both non-professional and professional information. In the former case, a “site” will contain almost any type of data, with Home Pages usually not disclosing either author or publisher, and the source of the information may be uncertain (for example, neither the reliability nor the authority for the data may be specified). Reliable legal databases fall in the latter, professional category.

A “database” is a collection of data organized in such a way that its content can be easily accessed, managed, and updated. Generally, a database has its own server hosting HTML, PDF, RTF, CGI, and other files. In practice, we work mainly with legal databases and not with Internet sites. Surprisingly, the word “Internet” in fact carries a pejorative meaning in our profession. In summary, to say that we evaluate legal databases is more accurate than to say that we evaluate legal sites on the Internet.

The process of monitoring and evaluating foreign and international legal databases from so many jurisdictions is also a question of content expertise and access, and requires certain skills. They range from weighing and authenticating documents found (to use Richard Danner’s words from his article “Redefining a Profession”5), identifying value-added features, and engaging in establishing database policies, to understanding copyright issues, maintaining links or implementing mirror sites, and dealing with different languages and various formats. Some consider that “source evaluation is an art”6, which can only be achieved by first acquiring an extraordinary level of understanding. As we know, reference librarians for foreign and international law have to work with databases located on faraway servers, from different jurisdictions, in many languages, in various formats, and displaying a different kind of thinking as well as different copyright issues – all of which contributes to the complexity of our task.

Though we are still in the process of creating the full set of necessary standards, there are already guidelines to guide us to the right information, from the right source, at the right time, and in the form most suitable to use. In addition to the evaluation forms I have designed and use on a daily basis, which are available at the Home Page of New York University Law Library – Guide to International and Foreign Legal Databases, I would like to bring to your attention a more comprehensive checklist for evaluating foreign and international legal databases.

Product Description and Main User Identification

Database name; Database subject; Database address; Date of evaluation; Name of the person evaluating and monitoring the database; Official citation; Database contact person.

Despite the fact that databases are cataloged today, not all of them are found in online catalogs, and not all of these elements are included in the online records where they exist. In this transitional and still troubled period, we need to watch the above elements carefully and act promptly. The address of a foreign or international database could change, as could its name and features. New content could be added. Coverage could change, and so could the language (abstracts in languages other than the vernacular, or even full interfaces in a language other than the official one, might appear overnight). In the case of fee-based databases with only one available password, this information needs to be shared with all other potential users. In this category fall databases such as The Inter-American Database for International Trade (NATLAW at http://www.natlaw.com) with one password, no IP access, and very dynamic changes in the content. In fact, all international databases are continuously loading old content, building mirror servers, and migrating to more accurate URL addresses consistent with the content and server requirements.

Content Completeness

Is this a full text database? Or, index and abstracts? Or, index only?

Many databases are still under construction or in the process of expanding their data. A relevant example is the European Union database, to which new data is added nearly every month. Even though databases without full text are less desirable, an index or abstract can often be very helpful in obtaining the citation to an official gazette and locating the full text of a law in electronic or print format from a foreign jurisdiction. Fee-based databases are not always available at our discretion, but they may nevertheless be valuable tools because of their free-to-see indices. Foreign databases displaying a Summary or an Index of laws should be taken into consideration. As illustrations, I would cite:

Magnus, a comprehensive fee-based Danish legal database, in the vernacular – with prices ranging from $300 to $450 per month, which allows free access to its valuable daily index of legal news.

TAXLINE – Offers daily tax news and daily summaries of Italian laws, decrees, and presidential ordinances from Gazzetta Ufficiale (Italy Official Journal). Because of the huge number of everyday changes in Italian law, such an Index is a blessing.

GLIN, a fee-based database, permits searching of its content. Citations to Official Gazettes or other foreign legal documents are always of great help. Unfortunately, GLIN doesn’t give us any idea about the data coverage by jurisdiction.

In the case of foreign and international substantive law published by legal databases on the Internet, the ideal situation is that author and publisher are the same entity: for example, a government, parliament, department of justice, international organization, or a legal institute working under government supervision. A fee-based database of a private vendor providing a guarantee of authenticity of legal documents could be viewed as preferable to a database that may be free but does not provide such a guarantee. The quality of the publisher is a guarantee in “creating quality controlled collections”.7 In the field of Internet databases, publisher and author are often the same. This is an interesting change in comparison with printed legal documents. Another development is that printed services now rely on or refer to online databases. A recent example is the Handbook of WTO/GATT dispute settlement by Pierre Pescatore, published by Transnational Publishers, Inc. In July 2000, a statement from the editors said that “in light of enormous number of pages of findings in recent WTO Panel and Appellate Body Reports, the editors of the Handbook have expanded their Summaries of the Panel and Appellate Body Reports and direct readers to the WTO Web site under “trade topics” – “dispute settlement” – for the full text of the decisions”.

There are situations when questioning closely the Webmaster of a database could be helpful. This was the case with The South African Cyber Treaty Series, by Arnold Pronto. To be sure about the reliability of the database, I directly asked the Webmaster about his credentials, source of data, etc. His answer was satisfactory, so I included his database in my annotated guide to South Africa legal resources.

What is the Source of Data/Source Verification

Is there a guarantee of content authenticity, e.g., electronic signature on imaged pages – such as at:

A scandal on Wall Street concerning Emulex (http://www.thestandard.com/article/display/0,1151,18995,00.html), in August, was related to sensitive information published by Internet portals that misled shareholders, and involved millions of dollars in losses. This event brought to international attention the hot issue of source verification. The upcoming series of lawsuits will certainly enforce this criterion as a requirement for reliable commercial and news databases. The evaluation of databases has to follow general guidelines, applied to any database on the Internet.

Does the database indicate the page of the printed document?

More and more databases are trying to offer the page of the printed legal document and the citation to the official code, gazette, or reporter. The German database REFACT (http://www.refact.de/rda_inh.htm) is one of them. UN Treaty Collection (http://untreaty.un.org/) displays the image page of the UNTS series. In Canada, QUICKLAW (http://www.quicklaw.com/) and a new database recently launched by eCarswell – LAW.PRO (http://ww1.ecarswell.com/law.pro), are fighting for the rights of pagination of the Carswell Law Reports. eCarswell also announced that it will offer legal documents in html and pdf (the imaged page of the reporter with the printed pagination).

Do we have the imaged version of the official page in “pdf” or other formats?

The ideal, of course, is to obtain the original version in the vernacular in pdf format (image of the original text) together with the translation in any other language (English or French or Spanish would be preferred whenever Chinese or Japanese or Russian are the language of databases). This is often hard and expensive to acquire – though some international organizations such as the UN, supranational organizations such as the European Union, governments such as the German and Swiss are already providing the data in this way. Nevertheless, it is preferable to have the original version of a legal document, instead of an English translation which may lack any guarantee of accuracy.

Language

Vernacular? Is there an English, French, Spanish, German abstract or an abstract in a well-known language?

I always compare online documents with the printed ones. This is an exercise that helps me see the differences or even the gaps. Gene L. Wilkinson emphasized the importance of objectivity and the avoidance of “indications of careless or hasty preparation”8. D. Scott Brandt considered as an important point of source evaluation the ways in which “information gets filtered”9, written and/or issued by an authoritative source (federal government, reliable organization); authenticated as part of an editorial or peer review process by a publisher; evaluated by experts, reviewers or subject specialists/librarians as part of a collection development. Aimee Glassel10 highlighted the importance of selection criteria to filter resources and recommended to “Start with a site recognized for its authority; look for references to new resources; evaluate new resources as either a potential site to review, or as a new source for further scouting”.

In any case, it is very worthwhile, whenever possible, to obtain the views and opinions of scholars from different countries concerning their own country’s databases. INT-LAW and EURLEX listserv members are of great help in maintaining the critical review process.

Currency

Currency is an important consideration, since Internet databases have an advantage over print sources. Currency means answering questions such as:

Does the organization have a commitment to ongoing maintenance and stability of the resource?

Currentness of links; The percentage of the dead and misdirected links.

Some work is timeless, explains Robert Harris, like the classic novels and stories; other work has a limited useful life because of advances in the discipline, and some work is quickly outdated – as in the field of common law. Therefore, many times we must decide whether the database information is still of value, and of how much value. Harris stressed that “An important idea connected with timeliness is the dynamic, fluid nature of information and the fact that constant change means constant changes in timeliness. The facts we learn today may be timely now, but tomorrow will not be.”

Often a document includes the date when the information was gathered, or clearly refers to dated information, to a “publication date or a last updated date”, or to a date of copyright.12

To determine the date of last updating of a database or of a legal document is one of the most difficult tasks – especially when dealing with civil law jurisdictions where no date is displayed on the Home Page, and sometimes the date refers to anything but to the question of updating the legal content. In many cases, even if Webmasters do not care to inform us about this issue, the machines themselves can be very informative – because they can neither lie nor ignore this problem due to their built-in software. The Hypertext Transfer Protocol (HTTP) is the language which Web clients and Web servers use to communicate with each other. The server can “tell” about the currency of data whenever there is a doubt related to a database or a document. Here is an example.

In trying to determine when a database was last updated, it is wise to use the HTTP protocol. To “talk” directly with the server, we have to operate through telnet from within a personal email account; otherwise the server cannot be approached. You have to identify yourself in order to have access to a server’s data. This is what I did when I tried to find out the updates of Confoederatio Helvetica (http://www.admin.ch), the Swiss government database:

First of all, I went to my telnet host is2.nyu.edu, logged in, went to “Network” and from there to “Telnet”. The technique is to omit http:// from the beginning of the URL but to add 80 at the end of the address, which is the standard port for HTTP servers.

1. At the prompt, telnet%, enter the name of the host you wish to connect to: www.admin.ch 80. Press Enter. The server answers:

Trying 193.5.216.31…
Connected to adminsrv.admin.ch.
Escape character is ‘^]’.

2. At this prompt, type the following command (^ means leave a space!): HEAD^/^HTTP/1.0, that is, HEAD / HTTP/1.0. Then press Enter twice, since a blank line ends the request.

The server reported that the French directory of the database was updated on the same day as the entire database. Also, comparing this with my previous attempts to find the same information, I learned from the server that the database is updated only once a week, specifically on Wednesdays. The shortcoming of this technique is that the server indicates the date of change of the physical file, which may not reflect the currency of the information13. Also, some pages are updated automatically in databases using artificial intelligence. The date is updated every time any change is made to the file, no matter how insignificant – for example, adding a comma.
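The telnet exchange above can also be scripted. The following is a minimal sketch in Python, not the procedure used for this article: it sends a HEAD request and reads the Last-Modified header, if the server chooses to send one. The function names are hypothetical, and Python's http.client speaks HTTP/1.1 rather than the HTTP/1.0 typed by hand above. The same caveat applies: the header gives the date of the physical file, not necessarily the currency of the legal content.

```python
import http.client
from email.utils import parsedate_to_datetime


def parse_last_modified(raw_headers):
    """Return the Last-Modified header of a raw HTTP response as a datetime, or None."""
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        # Lines without a colon (such as the status line) are skipped.
        if sep and name.strip().lower() == "last-modified":
            return parsedate_to_datetime(value.strip())
    return None


def fetch_last_modified(host, path="/", port=80, timeout=10):
    """Send HEAD <path> to host:port and return the Last-Modified datetime, or None."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        value = conn.getresponse().getheader("Last-Modified")
    finally:
        conn.close()
    # Many servers omit this header for dynamically generated pages.
    return parsedate_to_datetime(value) if value else None
```

Calling fetch_last_modified("www.admin.ch") would repeat the check described above, while parse_last_modified can be applied to header text saved from a telnet session. Repeating the check over several weeks is what reveals an update pattern such as a weekly cycle.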

Another way to investigate how current a web site is concerns the occurrence of dead and misdirected links. The higher the percentage, the less reliable the database. The age of links is also important. Links that point to out-of-date databases show that the database in question is not credible.
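The percentage of dead links can be estimated mechanically. The sketch below is illustrative only, and its function names are hypothetical. It assumes that a link counts as dead when the server cannot be reached or answers with an HTTP error status (400 or above). Redirects are followed silently, so misdirected links, which still answer but lead to the wrong place, have to be judged by a human reader.

```python
import urllib.error
import urllib.request


def check_link(url, timeout=10):
    """Return the HTTP status code for url, or None if the server cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404.
        return error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return None


def dead_link_percentage(statuses):
    """Percentage of links whose status is None or an HTTP error (400 and above)."""
    if not statuses:
        return 0.0
    dead = sum(1 for status in statuses.values() if status is None or status >= 400)
    return 100.0 * dead / len(statuses)
```

A typical use would be to collect the links from a database's pages, build a dictionary such as {url: check_link(url) for url in links}, and pass it to dead_link_percentage. Some servers reject HEAD requests, so a high figure should be spot-checked by hand before it is held against a database.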

Coverage

Dates covered

One of the major problems in evaluating Internet full text databases is that the records are very dynamic. The scope and the coverage can change overnight. A FAQ file or a description of a database should be revisited from time to time. Databases sometimes cover in full text only the last few weeks of a legal publication, as in the case of the French Official Journal (official version), or are trying to be complete, such as the International Court of Justice database. The ICJ is continuously loading legal data. As of August 2000, the database contains all contentious cases and advisory opinions referred to the Court since 1946. The United Nations ODS announces formally that the full text of documents dating back to 1992 is accessible in Portable Document Format (PDF) in all official languages of the United Nations – Arabic, Chinese, English, French, Russian and Spanish. Documents are stored in two databases: UN Documentation, which includes documents back to 1992, and UN Resolutions, which includes resolutions of the General Assembly, Security Council, Economic and Social Council and Trusteeship Council since 1946.

Archiving

How long will the provider keep data?

Unlimited time?

Limited cumulation?

Will the data be archived by another provider? Who?

Does the database state its archival responsibility?

Does the database permit making or obtaining digital copies of content for archiving and for use in perpetuity?

For example, EUR-Lex keeps E.U. Official Journal issues for only 45 days after the date of publication. Afterward, the data can be found in EUDOR. EUDOR informs users that the Official Journals of previous years (back to 1952) will gradually be made available from the time the site is made public (as of August 2000, EUDOR is a fee-based database). Norway’s legal database Lovdata offers Supreme Court decisions shortly after release by the Court, but each decision remains available for only about two months. Even though some databases are excellent and we rely on them fully, they make no archiving commitment. Neither CELEX nor the WTO mentions an archiving commitment anywhere, which should keep us aware that these databases could disappear. To avoid trouble in such an event, we should always have alternative parallel sources.

Search Quality/Findability

Is there a link to an external search engine or is there a search engine embedded in the database?

Serena Fenton rates “Search Engine Intelligence” as one of the most important criteria in evaluating a database.14 Actually, findability is important not only for new users but for experienced ones too15. The ability to find the required document quickly is a big advantage. There are databases such as GLIN and READEX (Index to United Nations microfiche system) where searching is becoming a hassle. In those cases, only the right title or citation to the legal document may help you to locate it. ODS used to be in the same category. The database’s dialog with its users, and its openness and willingness to change things, finally made ODS a product whose findability could be rated as very good.

Another issue related to findability is the terminology16 used for services. European or Asian databases use names different from those we are used to for the same services. Linguistic “obscurity”, arising from different legal concepts or from word-for-word translation of foreign legal concepts into English, could be a problem of which the information specialist should be aware.

Workability

Alastair Smith17 introduced this term, which relates to the convenience of a resource and how effective it is to use. Workability can include:

a logical manner to locate the resources/ the site navigation/the site legend;

The Eur-Lex site map is outstanding. A “Site Legend” is a “website feature that contains the descriptive documentation information essential for making an initial assessment about the website and the worth of its content.”18

a good organizational scheme to help (e.g., by subject, format, chronology, etc).

A perfect example of this kind of difficulty can be experienced at the Italian Supreme Court database – Corte Suprema di Cassazione, a fee-based database that costs $1,500/year for one password and is very frustrating to work with. EasyFind is the software that has to be downloaded in order to access the database. The download is possible only with a special password, which is different from the one given for accessing the database. The browser and computer configurations have to be changed whenever accessing the database. The downloading process and the connection to the database took me more than one week and many emails back and forth to Italy. Special assistance from the Library’s computer support was needed. All information regarding the database was in Italian only. The database handbook, written in 1996, did not cover all user questions. The database has to be adjusted to the new characteristics of the electronic environment in order to be accepted worldwide. The difficulties in accessing the database are considerable, and the required configurations are time-consuming and primitive. LexisNexis Butterworths Canada acquired Quicklaw in July 2002.

In order to access their documents, many databases require software applications (plug-ins) such as Adobe Acrobat Reader, Shockwave, Real Audio Player, and others. As a courtesy, some webmasters provide links to databases containing plug-ins that are easy to download. But not all necessary software is free. Some comes with the package. This is the case of JUDIS, containing all reportable judgments of the Supreme Court of India from 1950. From 1950 to 1998 they are on the CD-ROM. From 1986 to date, they can be obtained online. A special device is needed in order to access the CD-ROM. Asian Suite is a software package for accessing (reading) Chinese and Japanese databases.

Stability/Server Reliability/Malfunctions

Is there a mirror site? Does database use fluctuate during different times of day?

A study from the Internet Research Group indicates that 85 percent of the largest Web databases will have multiple hosts by the year 2001.19 As the Internet grows, the practice of having two or more hosts for the database is becoming more prevalent. This is a procedure for speeding the delivery of content to the end users (downloading) and also a measure to improve managing database traffic.

Servers have a limited number of entries, which means that only a limited number of users can access the server at the same time. An example of extreme congestion is seen at the CELEX database. CELEX doesn’t have its own server, being hosted by the European Union database. CELEX is a directory of the EUROPA server. This situation makes CELEX practically unapproachable in Europe from 9 AM to 5 PM because of the enormous number of users during that period.

One may be better off to use more than one location such as in the case of CISG (United Nations Convention on Contracts for the International Sale of Goods). The CISG Online Project is maintained by the Institute of Foreign and International Private Law at the University of Freiburg, Germany. It provides the status of the convention, text, cases, bibliographic references and German case law on the United Nations Convention on Contracts for the International Sale of Goods (CISG). There is also the International Trade Database at Pace University CISG-UN Convention on Contracts for the International Sale of Goods which has similar content. Another example is the International Court of Justice official database with two mirror servers: the Cornell mirror database and the French mirror.

Malfunctions is a category open to any kind of problem: it lists any errors encountered by the searcher. They could range from server malfunction during some periods of time (European databases have their peak-hour problems, as do United Nations databases) to the fact that similar searches in the same database in the same period of time bring different results. As Karen Diaz remarked in her above-mentioned article on “The best of the best”, a great database that cannot be accessed “is as useful as a great reference book that is nowhere to be found”.

Interactivity with the User

Are there interactive files such as forms for queries, comments, requesting documents and ordering publications (using cgi scripts, perl programs)? Are they helpful? Is there user support?

A clue to how developed services are in establishing web-based relationships with end users is the length of time it takes to respond to an email message submitted to a database. Speaking from my own experience, over time I developed good relationships with databases such as WTO based in Geneva, SAIJ based in Argentina, Wettenbank from The Netherlands, Hong Kong and China legal portals, and AUSTLII located in Australia. The time lag has to be taken into consideration when dealing with overseas databases.

Cost Over Print Format

Are there hidden costs?

Other costs that are not hidden?

How much cost is attributable to:

licensing of the content?

providing access?

My belief is that a database should cost no more than its print counterpart, and that the ideal is to maintain one paper copy in addition to the electronic version. However, in this transition period, print may cover more than databases, so we may be paying a lot for a database that is still incomplete or growing, even though reliable (UN ODS). But the database has the advantage of being more timely than the printed service. After all, it is preferable to have this kind of policy for two reasons: first, because electronic databases represent the future format in which legal documents will exist, and second, because today electronic databases represent an excellent tool for accessing the most up-to-date legal information.

Almost every fee-based database takes a different approach to pricing. Many of them offer a subscription for a number of records and require further payment for any additional record, as in the case of the Chinese CEIlaw database. European databases such as the French Jurifrance and the Netherlands’ Wettenbank offer an amount of information to be seen and downloaded for a flat fee, and charge for any additional use. This causes an enormous bureaucracy. I receive receipts from Europe whenever I view a document from the above-mentioned databases. LAWSITE from New Zealand charges per view, per search, and per Data Services Product, such as Recent Decisions – $5.00 (PDF format); Environmental News Bulletin – $5.00 (PDF format); Environmental Digest per section – $3.00; per statute – $20.00; per PDF decision downloaded – $15.00, etc.

Copyright Stipulations

Information about whether, and to what degree, legal information published by a foreign or international database may be used is crucial. Usually, a database has to maintain a file listing the permission to reproduce and the protocols to be followed in such a situation. Some databases want to be sure that aggressive users will not download huge amounts of information, so downloading is restricted to a certain amount. On the other hand, free databases want to have control over users, so registration or a request for permission to reproduce may be necessary.

Does the database permit fair use of all information for non-commercial educational, instructional and research purposes by authorized users, including unlimited viewing, downloading, and printing?

Is the number of simultaneous users limited?

As we know, licensing agreements have become more prominent in the electronic era. A licensing agreement should cover those circumstances which a content owner and user/purchaser agree upon for the use of certain specified content in a digital environment. First, we have to understand the nature of licensing content. After that, we have to decide what is the appropriate type of arrangement.

There is no consistency in the way this matter is handled. TAXBASE accepted multiple users for one password. A single annual subscription gave unlimited access to all tax news and documents to a tax professional in the company – which meant that one “tax professional” could make his password available to more than 10 persons working for him. But things changed. TAXBASE today sells portions of the database for a certain amount of money. For special multiple user subscriptions we see different prices. By contrast, ODS and CELEX only allow one access per password at a time, while the United Nations Treaty Collection and the European Journal of International Law allow IP access.

“The law governing electronic commerce, software licenses, and published information content over the Internet is in larval stage both in the United States and around the globe”.20 This reality sometimes makes it difficult to deal with cross-border procedures and restrictions, norms of foreign jurisdictions, and even rejection. Some databases reject any servers from outside the country of origin. There are databases that do not allow a foreign user to view the content and do not want to sell their content to foreign users. NACSIS-IR is a Japanese provider that sells databases only in Japan. Japanese Laws in Force is offered for “domestic service only”, as are other databases.

Structural Considerations/Database Performance

The record structure and the index structure of databases with built-in search engines reveal the structure of the database.21 The evaluation has to assess the record structure and the subject hierarchy structure. Greg R. Notess recommends as a technique the comparison of the subject structure to the traditional thesauri. In databases with powerful search engines, the full text stored in the databases is available through search engine results. The retrieved records display extracts of the full texts, so the searcher does not know which fields are searchable, as he would in Lexis, Westlaw, or Dialog. The retrieved records are his only guidance. Because the indexes of internal search engines cannot be browsed and the search engines do not offer the full record, the searcher has to know in depth how the database search engine manages the search.

The duration of transactions22 (search and content delivery) is another point to be considered especially in Europe where the Internet connection is very expensive. This is a key measurement of the database performance.
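The duration of a transaction can be measured rather than estimated. The following minimal sketch, with a hypothetical function name, times any callable that performs one search or one document download and reports the median of several runs. The median is an assumption of this sketch, chosen so that a single slow response during peak hours does not dominate the result.

```python
import statistics
import time


def time_transaction(transaction, repeats=5):
    """Run transaction() several times and return the median duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        transaction()  # e.g. submit one search or fetch one document
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)
```

Running the same measurement at different hours of the day would also show how strongly database performance fluctuates with use.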

Conclusion

It is becoming clear that the future of providing legal information will be in electronic format. Therefore, it is critically important to establish from the outset clear standards for publication over the Internet. They have to be known by Internet database publishers. The purpose of this paper is to define standards, in spite of the fact that further work remains to be done in developing them for international use.

The information and research environments make the evaluation process a problem of content, access, point of view, and server and database configuration. The evaluation depends on our patrons’ needs and on the realities of our workplace’s technical and legal resources. The main concern in the evaluation process is the Quality of Information found on the Internet and the ways of accessing it. Criteria of evaluation are also starting points in building electronic resource collections, developing Internet-based information services, or instructing users in the effective use of digital information.
