The battle for the future of the Open Web is taking place as a new document model merges into a platform for highly graphical, interactive and information rich applications. Open source communities vie with dominant vendors Adobe, Microsoft, Apple, Cisco, Nokia and Google to stake out their claims as open source innovations collide with standards consortia and proprietary alternatives.

Wednesday, April 23, 2014

The Internet is always changing. Sites are rising and falling, content is deleted, and bad URLs can lead to '404 Not Found' errors that are as helpful as a brick wall.

A new project proposes an do away with dead 404 errors by implementing new HTML code that will help access prior versions of hyperlinked content. With any luck, that means that you’ll never have to run into a dead link again.

The “404-No-More” project is backed by a formidable coalition including members from organizations like the Harvard Library Innovation Lab, Los Alamos National Laboratory, Old Dominion University, and the Berkman Center for Internet & Society. Part of the Knight News Challenge, which seeks to strengthen the Internet for free expression and innovation through a variety of initiatives, 404-No-More recently reached the semifinal stage.

The project aims to cure so-called link rot, the process by which hyperlinks become useless overtime because they point to addresses that are no longer available. If implemented, websites such as Wikipedia and other reference documents would be vastly improved. The new feature would also give Web authors a way provide links that contain both archived copies of content and specific dates of reference, the sort of information that diligent readers have to hunt down on a website like Archive.org.

While it may sound trivial, link rot can actually have real ramifications. Nearly 50 percent of the hyperlinks in Supreme Court decisions no longer work, a 2013 study revealed. Losing footnotes and citations in landmark legal decisions can mean losing crucial information and context about the laws that govern us.

The same study found that 70 percent of URLs within the Harvard Law Review and similar journals didn’t link to the originally cited information, considered a serious loss surrounding the discussion of our laws.

The project’s proponents have come up with more potential uses as well. Activists fighting censorship will have an easier time combatting government takedowns, for instance. Journalists will be much more capable of researching dynamic Web pages.

“If every hyperlink was annotated with a publication date, you could automatically view an archived version of the content as the author intended for you to see it,” the project’s authors explain. The ephemeral nature of the Web could no longer be used as a weapon.

Roger Macdonald, a director at the Internet Archive, called the 404-No-More project “an important contribution to preservation of knowledge.”

The new feature would come in the form of introducing the mset attribute to the <a> element in HTML, which would allow users of the code to specify multiple dates and copies of content as an external resource.

For instance, if both the date of reference and the location of a copy of targeted content is known by an author, the new code would like like this:

The 404-No-More project’s goals are numerous, but the ultimate goal is to have mset become a new HTML standard for hyperlinks.

“An HTML standard that incorporates archives for hyperlinks will loop in these efforts and make the Web better for everyone,” project leaders wrote, “activists, journalists, and regular ol’ everyday web users.”