User:Maximilian.Klein.LRMI/RfC/FAQ

We are Max Klein of UntrikiWiki.com and Yaron Koren of WikiWorks.com. Max has worked as Wikipedian-in-Residence for OCLC Research, where he created VIAFbot. Yaron, a longtime MediaWiki developer, administrator, and consultant, recently published a manual on MediaWiki, Working with MediaWiki.

The development of HTML Tags is funded by a grant through Creative Commons, the Mountain View-based open tech nonprofit. Creative Commons is responsible for the creation and maintenance of the Creative Commons licenses that Wikibooks (and other Wikimedia projects) use, as well as a bunch of other fun stuff. This project doesn’t have a commercial purpose - Creative Commons' mission (which we share) is to develop and support the technical infrastructure necessary to maximize digital creativity, sharing, and innovation.

What is metadata?

Descriptive metadata is structured information that describes the content that it is associated with - it’s information about information. It can include stuff like how long content is expected to take to consume, what the topic of the content is, who the content is aimed at, and other similar information.
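As a rough illustration, descriptive metadata for a single chapter might look like the record below. The property names (name, about, typicalAgeRange, timeRequired, learningResourceType, inLanguage) are taken from schema.org and LRMI, but every value is invented for this sketch, and the `describe` helper is hypothetical.

```python
# Hypothetical descriptive metadata for a children's chapter about the Sun.
# Property names come from schema.org/LRMI; all values are made up.
sun_chapter_metadata = {
    "name": "The Sun",                      # title of the resource
    "about": "Solar System",                # topic of the content
    "typicalAgeRange": "8-12",              # who the content is aimed at
    "timeRequired": "PT20M",                # ISO 8601 duration: 20 minutes
    "learningResourceType": "textbook chapter",
    "inLanguage": "en",
}


def describe(metadata):
    """Return one 'property: value' line per metadata field."""
    return [f"{key}: {value}" for key, value in metadata.items()]


for line in describe(sun_chapter_metadata):
    print(line)
```

The content of the chapter itself appears nowhere in this record; it only describes the chapter, which is what makes it metadata.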

What is LRMI?

LRMI (the Learning Resource Metadata Initiative) is a joint project that is co-led by Creative Commons and the Association of Educational Publishers (the only professional organization that covers the entire educational resources community). The standard was developed in an open and collaborative process that actively sought to involve all major stakeholders as well as the general public. The advisory group for the initiative had members from Scholastic, Pearson, Houghton Mifflin Harcourt, Curriki, and McGraw Hill, among others. The technical working group that worked on the project involved many people with relevant expertise, including the head of the Dublin Core Metadata Initiative, and people from Creative Commons, Microsoft, and the Gates Foundation, as well as Wikipedian (and UCB professor) Brian Carver. (You can see a full list of members of both the advisory group and the technical working group here.)

What is Schema.org?

Schema.org is a joint project between Google, Bing, Yahoo, and Yandex - four of the world’s biggest search engines that collectively account for more than 96% of all web searches worldwide - that aims to develop a collection of metadata schemas that can be used by webmasters to provide search engines with extra information about their content so that search engines can improve the quality of their results.

Why LRMI and Schema.org over competing standards?

There are competing metadata standards, but to be useful, metadata standards must be supported by tools. We believe that the advantage inherent to being supported by four of the world’s biggest search engines means that, inevitably, schema.org will win out over its competitors.

If we are wrong and another metadata standard eventually supplants LRMI/schema.org, HTML Tags will, from a technical standpoint, be able to support other metadata standards with a relatively small amount of modification. Where there are one-to-one parameter equivalencies, a bot could be employed to convert pre-existing LRMI/schema.org parameters automatically.
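A minimal sketch of how such a bot conversion could work, assuming a simple lookup table of one-to-one equivalencies. The "otherstd:" names are placeholders for a hypothetical successor standard, not a real vocabulary, and `convert_parameters` is an illustrative helper rather than part of HTML Tags.

```python
# Hypothetical one-to-one mapping from LRMI/schema.org parameter names to
# a made-up successor standard. The "otherstd:*" names are placeholders.
LRMI_TO_OTHER = {
    "typicalAgeRange": "otherstd:ageRange",
    "timeRequired": "otherstd:duration",
    "learningResourceType": "otherstd:resourceType",
}


def convert_parameters(params, mapping):
    """Split params into converted ones and ones with no known equivalent."""
    converted = {}
    unmapped = {}
    for name, value in params.items():
        if name in mapping:
            converted[mapping[name]] = value
        else:
            # No one-to-one equivalent: leave for a human editor to review.
            unmapped[name] = value
    return converted, unmapped


converted, unmapped = convert_parameters(
    {"typicalAgeRange": "8-12", "educationalUse": "reading"},
    LRMI_TO_OTHER,
)
print(converted)
print(unmapped)
```

Parameters without a one-to-one equivalent are returned separately instead of being dropped, so a bot built along these lines would only automate the unambiguous cases.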

Why should we use metadata/HTML Tags?

Adding metadata to Wikibooks would have a couple of very beneficial effects in the near term. The biggest would be making Wikibooks’ resources easier to find in most major search engines: Schema.org was formed collaboratively by Bing, Google, Yahoo!, and Yandex with the explicitly stated aim of making it easier for their users to turn up high-quality, relevant results in their searches. Good metadata should also improve the preview of Wikibooks content that is shown in search results, since most major search engines use metadata markup to generate better previews where it is available.

As an example of how this could play out, take a look at the Wikijunior book about the solar system - specifically, its chapter about the Sun. It presents a pretty good overview of the Sun, aimed at elementary school students. Despite the fact that it’s a pretty solid chapter, it doesn’t get very much traffic - only about 220 views a month. This seems like far too little exposure for such a high-quality book. After playing with Google for quite a bit, we found that this chapter does not rank highly for most relevant keywords - the addition of accurate metadata to this book should make it easier to find via those keywords. (You can’t add metadata to low-quality pages and expect a meteoric search engine boost, but when you add metadata to pages that already have high-quality content, the results can be remarkable.)

We also anticipate that there may be unexpectedly creative uses of HTML Tags to support things other than metadata. Although we cannot guarantee we’ll be able to provide technical support for all such uses, we are certainly excited about them. We’ll support cool side projects where we can, and if something comes up that we can’t support, we’ll try to connect you with volunteers who have the appropriate skill sets to move your project forward.

What would adopting HTML Tags and LRMI involve?

Does HTML Tags pose a security risk?

No.

The set of tags and tag attributes enabled in HTML Tags by default (which is the set necessary to properly support LRMI and schema.org’s metadata schema) poses no security risk. The allowable set of tags and tag attributes is set in LocalSettings.php, and can only be modified by Wikimedia Foundation staff and trusted volunteer developers. Additional tags and tag attributes could be enabled in the future (via Bugzilla request) if they end up being desired. (Additional metadata terms could be enabled by editing the existing template structure, without needing to go to Bugzilla.)
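To illustrate the allowlist idea in general terms, here is a small Python sketch. It is not how HTML Tags or MediaWiki implement the check, and the `ALLOWED` table is an assumed example of tags and attributes that microdata markup typically needs, not the extension's actual default configuration.

```python
from html.parser import HTMLParser

# Assumed example allowlist: tag name -> attributes permitted on that tag.
# This is NOT the real HTML Tags configuration.
ALLOWED = {
    "div": {"itemscope", "itemtype", "itemprop"},
    "span": {"itemscope", "itemtype", "itemprop"},
    "meta": {"itemprop", "content"},
    "link": {"itemprop", "href"},
}


class AllowlistChecker(HTMLParser):
    """Record every tag or attribute that is not on the allowlist."""

    def __init__(self, allowed):
        super().__init__()
        self.allowed = allowed
        self.violations = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.allowed:
            self.violations.append(f"tag <{tag}> is not allowed")
            return
        for name, _value in attrs:
            if name not in self.allowed[tag]:
                self.violations.append(
                    f"attribute {name!r} is not allowed on <{tag}>"
                )


def find_violations(fragment, allowed=ALLOWED):
    """Return a list of allowlist violations found in an HTML fragment."""
    checker = AllowlistChecker(allowed)
    checker.feed(fragment)
    checker.close()
    return checker.violations


print(find_violations('<meta itemprop="name" content="The Sun">'))
print(find_violations('<span onclick="x()" itemprop="name">Sun</span>'))
```

The point of the design is that anything not explicitly listed is rejected, so script tags and event-handler attributes never get through unless someone with access to the configuration adds them.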

How much of a workload would this add to the community?

Adding metadata to Wikibooks would be a gradual process - no one is going to need to go through 50,000 content-space pages and categorize them all at once. Ideally, editors would slowly begin to add metadata to whatever pages they edit. There would be a couple of ways to go about doing this.

The first would be to add a template to the page by hand, which in most situations would be {{LRMI-object}}. {{LRMI-object}} is similar in structure to the {{Authority control}} template on the English Wikipedia. Some more information about the templates is found earlier in this document. The second way to add metadata to a page would be to use the LRMI button that we’ve added to the editing toolbar in our demo wiki (which can be found in the editing window, three buttons to the right of the button used to italicize text).

After the template parameters have been set, the template will, behind the scenes, render the basic template structure into the non-user-friendly HTML required to meet LRMI and schema.org’s exact specifications. Once a sizable portion of Wikibooks’ content has been tagged with metadata, there should begin to be noticeable improvements to things like search engine result quality without any additional effort on the part of editors. This improvement is likely to become even more pronounced once more websites broadly implement metadata schemes.
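To give a feel for what that rendering step produces, here is a hedged Python sketch that turns a set of parameters into schema.org microdata (the itemscope, itemtype, and itemprop attributes are standard HTML microdata). The `render_microdata` function and the sample parameter values are illustrative assumptions; the actual template output may be structured differently.

```python
from html import escape


def render_microdata(item_type, properties):
    """Render properties as schema.org microdata inside a single <div>.

    item_type is a schema.org type name such as "CreativeWork".
    """
    lines = [
        f'<div itemscope itemtype="https://schema.org/{escape(item_type)}">'
    ]
    for name, value in properties.items():
        # Escape names and values so they cannot break out of the attribute.
        lines.append(
            f'  <meta itemprop="{escape(name)}" content="{escape(str(value))}">'
        )
    lines.append("</div>")
    return "\n".join(lines)


# Hypothetical parameters an editor might have filled in.
print(render_microdata(
    "CreativeWork",
    {"name": "The Sun", "typicalAgeRange": "8-12", "timeRequired": "PT20M"},
))
```

Editors would only ever see the short parameter list; markup of this kind is what search engines read.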

What if two editors disagree about the appropriateness of a particular tagging?

We don’t envision a special or standalone system for dealing with disagreements of this sort. We anticipate that most tagging will be fairly straightforward and that disputes will be uncommon; when they do come up, it should be possible to handle them using Wikibooks’ existing dispute resolution system.

Are there any licensing incompatibilities between LRMI/Schema.org/HTML Tags and Wikibooks?

No.

Schema.org is licensed under CC-BY-SA 3.0, the same license that Wikibooks’ content is normally released under. LRMI’s vocabulary is also released under CC-BY-SA. HTML Tags is released under the GPL, which is a free license (and the license that MediaWiki extensions are typically released under). All of these licenses are 100% compatible with being used on Wikibooks.