Sunday, December 21, 2014

Full-fledged mobile computing devices are far more common these days than desktops ever were. Websites now deliver more of their services to mobile devices than to desktops. Google now claims to get more searches through mobile devices.

Today most websites will ask you to download their app and install it on your mobile device. With package managers like "Google Play Store", people find it extremely easy to do so. With just a single click and a disclaimer listing all possible privacy violations, people install the mobile app.

It is strange that the HTML browser, which has been the de facto standard for delivering content and untrusted code, has been so swiftly replaced by the mobile app. Browsers provide security from untrusted code because the exposed scripting language, such as JavaScript, is managed code with limited functionality. In cases where people install "native code" like Adobe Flash Player, they are at the mercy of the "severe bugs" in Flash Player that keep appearing every week. No doubt social engineering attacks targeting vulnerable Flash Player installations have become quite common on Facebook. JavaScript engines never had that many issues.

If installing any "native code" is so dangerous, the obvious question is how come folks are installing mobile apps from untrusted sources without any second thoughts? Some would argue that Android provides a Java framework, so all apps are written in Java and there is little chance of having vulnerable programs. That is quite untrue, as apps are allowed to bundle native code that can be invoked through the JNI interface. Bugs in native code could be exploited by any other untrusted app.

Besides security, mobile apps have also turned traditionally asynchronous applications on their head. For example, I used to open up a mail app to check for new mails. A mobile app, by contrast, beeps every time you get a mail. And yes, not all mails deserve my attention immediately. So converting asynchronous applications into synchronous ones provides little utility and very annoying beeps.

Some people believe that opening an app is faster than typing a website address in a browser. But today's browsers provide easy navigation through bookmarks and clickable thumbnails. So that argument is also specious.

Lastly, there are many sensors in a mobile device (like GPS, compass etc.) that can be used by a mobile app to deliver better services. While HTML 5 has incorporated some of these sensors, access to them remains perhaps the only legitimate reason for installing a mobile application.

Saturday, October 4, 2014

I got a mail from Raghav Das saying that a Supreme Court judgment related to cow protection, State of Gujarat Vs. Mirzapur Moti Kureshi Kassab Jamat & Ors, dated 26th October, 2005, was missing. Only a dissenting opinion from Justice A Mathur was available on the website. The main opinion was indeed missing.

In trying to avoid duplicate judgments from the court, the Indian Kanoon crawler missed downloading the opinion from the main bench. The bug has been fixed and Justice Lahoti's opinion for the main bench is available here. The combined judgments are also available now here.

Thanks Raghav for reporting the problem! It exposed an important bug in the system, and the entire SC website will be re-crawled to fix any other missing judgments.

Tuesday, September 16, 2014

A new release was rolled out on Sunday in an effort to make Indian Kanoon the gold standard for legal research in India. The release consists of a lot of user-visible changes and includes all the changes that were planned in July. While some of these changes are related to improving the infrastructure, most of them have come from people's complaints and the problems they have been facing while using Indian Kanoon. Here is the broad list of changes:

1. Removed duplicate judgments: Many court websites in India have separate URLs for each case number even when these cases are combined and only one judgment is delivered. The new release ensures duplicate judgments are filtered out.

2. Improved the judgment layout: The new release removes page numbers and case numbers that are sprinkled in the judgment. Also, new code has been developed to identify paragraphs, quotes and tables so that judgments can be laid out in an appealing format.

3. Improved PDF copy: htmldoc has been enhanced to generate PDF output in a more readable Georgia font, using the approach described here.

4. Consumer Court judgments added: Roughly 1 lakh (100K) judgments from the National and State Consumer Redressal Commissions have been added to the Indian Kanoon database. New cases from these commissions will also get updated every day. For example, look at the consumer cases against Airtel here.

5. Improved titles for Bombay, Kolkata, Andhra and Kerala high courts: Since these courts do not provide meta information, the petitioner and such details are extracted from the free text. Earlier there were a lot of errors in these extractions.

6. New Design: A new center based design was rolled out on Sunday. Do let me know of any usability issues you have encountered in the new design.

7. Software updated: The entire software stack was updated including the kernel on the production host. For the first time it has been achieved without any downtime or any user experience issues. Also for the first time the production traffic was served on Sunday using a multi-node setup. It is not a user facing change.
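
As an illustration of the duplicate filtering in item 1, here is a minimal sketch and not the actual Indian Kanoon code. It assumes duplicates can be detected by hashing the whitespace-normalized, lowercased judgment text; the function names `fingerprint` and `dedupe` and the `(url, text)` input shape are hypothetical.

```python
import hashlib
import re


def fingerprint(text):
    """Hash the judgment text after collapsing whitespace and lowercasing.

    Assumption: two copies of the same judgment differ only in spacing/case.
    """
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe(pages):
    """Group (url, text) pairs so each distinct judgment appears once.

    Returns a list of (text, [urls]) in first-seen order; every URL that
    served the same judgment is kept alongside the single stored copy.
    """
    groups = {}
    for url, text in pages:
        key = fingerprint(text)
        if key not in groups:
            groups[key] = (text, [])
        groups[key][1].append(url)
    return list(groups.values())
```

Keeping the full list of URLs per judgment means every case number still resolves to the one stored copy. Note that hashing the text only merges exact copies: a dissenting opinion has different text from the main opinion, so the two stay separate.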

Thursday, July 3, 2014

Here is a broad list of changes to Indian Kanoon that I am planning for the next release (hopefully out by the end of July 2014):

1. Remove duplicate judgments. Many high courts, like Uttarakhand, Jharkhand, Orissa and Bombay, are publishing the same judgment on different URLs. Perhaps every case number has a new URL even when cases are combined together in a judgment.

2. Improve the judgment layout. It will involve removing page numbers and case numbers sprinkled in between the text. It will also involve improving the layout and paragraph detection algorithm.

3. Improve the PDF output. Currently IK uses htmldoc for generating PDF output. But it has limitations on the fonts we can use. So the plan is to move to Pisa.

4. Add judgments from Consumer Courts at national and state levels. District consumer forums in Maharashtra, Gujarat and Karnataka publish in local languages with custom fonts embedded in the PDF. As the text is not in Unicode, it will require a lot more effort to reverse engineer the fonts for any text processing. So district consumer courts will not be added.

5. Improve titles for Bombay, Kolkata, Andhra and Kerala high courts. Since these courts do not provide meta information, the petitioner and such details are extracted from the free text. Currently there are a lot of errors in these extractions and the plan is to improve them.
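
The page-number cleanup in item 2 could look roughly like the sketch below. It is only an illustration under stated assumptions: that page numbers sit on their own line in forms such as "Page 2 of 10", "- 3 -" or a bare number. The pattern `PAGE_LINE` and the function `strip_page_numbers` are hypothetical names, and real court PDFs will need more patterns than these.

```python
import re

# Assumed page-number shapes: "Page 2", "Page 2 of 10", "- 3 -", or a bare "12".
PAGE_LINE = re.compile(
    r"^\s*(page\s+\d+(\s+of\s+\d+)?|-\s*\d+\s*-|\d+)\s*$",
    re.IGNORECASE,
)


def strip_page_numbers(text):
    """Drop lines that consist only of a page number; keep all other lines."""
    kept = [line for line in text.splitlines() if not PAGE_LINE.match(line)]
    return "\n".join(kept)
```

Only whole-line matches are removed, so a number inside a sentence (an amount or a section number, say) is left alone.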

Tuesday, June 24, 2014

Some of my lawyer friends have told me in the past that they need a way to "Search Within Results". Consider an example where a person has filed a "child custody" case and he or she is an accused in a separate criminal trial of "attempt to murder". The lawyer wants to know whether the ongoing criminal trial will have a bearing on the "child custody" case or not. The lawyer searches for "child custody" and then wants to look only at those cases that have "attempt to murder" in them. One easy way would be to just search for "child custody attempt to murder". But a search for the entire phrase may not get you very relevant results because the phrases are disconnected. So it will be better if we can filter the documents by "custody of child" and then look for the most relevant documents by the phrase "attempt to murder".

To support this use case, a new "filter" keyword has been added. Now you can search for "attempt to murder" in documents that have "custody of child" with the following search query: "attempt to murder filter: custody of child". In this case the ranking will be decided only by the phrase "attempt to murder", giving better results.

There is a new box provided at the bottom of the page titled "Search Within Results" that takes existing search as the "filter" and then uses new query to rank order the matching documents.
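
To make the filter semantics concrete, here is a toy sketch, not the real Indian Kanoon search code. It assumes documents are plain strings, that the filter is a case-insensitive phrase match, and that relevance is simply how often the ranking phrase occurs; `parse_query` and `search` are hypothetical names.

```python
def parse_query(query):
    """Split 'ranking phrase filter: filter phrase' into its two parts.

    If there is no 'filter:' keyword, the filter phrase is empty.
    """
    rank_part, sep, filter_part = query.partition("filter:")
    filter_phrase = filter_part.strip() if sep else ""
    return rank_part.strip(), filter_phrase


def search(docs, query):
    """Keep docs containing the filter phrase, then rank by the other phrase.

    Ranking here is a plain occurrence count (an assumption for illustration);
    the filter phrase does not contribute to the score at all.
    """
    rank_phrase, filter_phrase = parse_query(query)
    rank_phrase = rank_phrase.lower()
    filter_phrase = filter_phrase.lower()

    scored = []
    for doc in docs:
        text = doc.lower()
        if filter_phrase not in text:
            continue  # fails the filter, never ranked
        score = text.count(rank_phrase)
        if score > 0:
            scored.append((score, doc))

    # Stable sort: ties keep their original order.
    scored.sort(key=lambda pair: -pair[0])
    return [doc for _, doc in scored]
```

Calling `search(docs, "attempt to murder filter: custody of child")` returns only documents that mention "custody of child", ordered purely by how strongly they match "attempt to murder".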

Friday, May 2, 2014

I am the founder of a legal search engine called Indian Kanoon (http://indiankanoon.org). Indian Kanoon provides state-of-the-art free search and free access to Indian court judgments to the common people.

Indian Kanoon crawls different Indian court websites daily and adds the set of updated judgments for the Supreme Court and high courts to its database. Since court judgments do not have copyright protection, Indian Kanoon does not violate any copyright law.

In fact, Indian Kanoon provides just another portal for people to get access to court judgments and thereby allows more widespread distribution of court judgments. Restricting access to judgments in this particular fashion will hinder Indian Kanoon's ability to provide access to Patna High Court decisions and thereby people's ability to have easy access to court judgments. Currently Indian Kanoon serves more than 2 million users every month.

Indian Kanoon fills in many voids which exist in current Indian court websites. Restricting access to judgments also forces people to stay with the court websites and prevents them from using the law search tools provided by other providers like Indian Kanoon. I think providing unhindered access to court judgments is in the interest of the Indian people, as they can use any research tools provided by any competitive portal. If such restrictions are removed, people can choose whichever website they like most.

If the problem was the Patna High Court server getting overloaded because of Indian Kanoon crawling, I would like to state that Indian Kanoon crawls the website only once, at 12:00 AM IST, when there is little chance of affecting any normal user on your website. Further, replicating court judgments on Indian Kanoon reduces the load on Patna court servers, as many people can access the judgments directly on Indian Kanoon.

So I would kindly request you to remove the captcha restriction on the Patna High Court website.