User:Multichill/Commons Wikidata roadmap

This page describes the Commons Wikidata roadmap as I see it. It may change over time. It contains several items sorted in the order I think they will happen. The actual order of deployment might differ.

Almost every Wikipedia has a Commons category template linking articles and categories to their equivalents at Commons. This is the same problem as with the interwikis: we have to maintain it on every Wikipedia and that's quite a lot of work. To centralize this, the property Commons category (P373) has been created. This is a property to ease the transition and it will disappear over time.

Bots should be used to keep everything here consistent: if an item A has a property "Topic main category" linking to a (category) item B, the interwiki link on item B should be exactly the same as the Commons category (P373) claim on item A.
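The consistency rule above could be checked by a bot along the following lines. This is only a minimal sketch: the record layout, the key `topic_main_category`, the placeholder ID `CAT_HAARLEM` and the prefix normalization are assumptions made for illustration, not the real Wikibase data model or API.

```python
# Simplified, hypothetical item records; real items carry much more structure.
# "topic_main_category" stands in for the "Topic main category" property.
ITEMS = {
    "Q9920": {  # Haarlem
        "claims": {"topic_main_category": "CAT_HAARLEM", "P373": "Haarlem"},
        "commons_sitelink": None,
    },
    "CAT_HAARLEM": {  # placeholder ID for the category item (item B)
        "claims": {},
        "commons_sitelink": "Category:Haarlem",
    },
}

PREFIX = "Category:"


def normalize(name):
    """Strip an optional 'Category:' prefix so both forms compare equal.

    Assumption: the claim and the link may differ only in this prefix.
    """
    if name is None:
        return None
    return name[len(PREFIX):] if name.startswith(PREFIX) else name


def find_inconsistencies(items):
    """Return (item A, item B, P373 value, link on B) for every mismatch."""
    problems = []
    for item_id, item in items.items():
        claims = item["claims"]
        category_id = claims.get("topic_main_category")
        p373 = claims.get("P373")
        if category_id is None or p373 is None:
            continue  # the rule only applies when both claims are present
        sitelink = items.get(category_id, {}).get("commons_sitelink")
        if normalize(sitelink) != normalize(p373):
            problems.append((item_id, category_id, p373, sitelink))
    return problems
```

A bot would run something like `find_inconsistencies` over real data and either report or repair each mismatch it finds.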

Note that Commons category (P373) is still used in Wikipedia because getting the interwiki from a linked item is not possible yet.

We want to move away from the current category system to a better system, but we don't want to throw away years of work. We need to model intersected categories on Wikidata to make the move easier. An intersected category is a category that intersects two or more topics. Take for example Category:Churches in Haarlem. This is an intersection of Haarlem (Q9920) and church (Q16970). We need to model this because at some point we want to add these claims directly to images (probably with a bot), and this also helps with the search engine.
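As a rough illustration of what such a model could look like, the sketch below maps an intersected category to the set of topics it intersects and derives the topics for a file from its categories. The mapping structure and the function name are assumptions; only the category name and the two item IDs come from the example above.

```python
# Each intersected category maps to the topics it intersects.
INTERSECTIONS = {
    "Category:Churches in Haarlem": {"Q9920", "Q16970"},  # Haarlem, church
}


def topics_for_file(categories, intersections):
    """Collect the topics implied by a file's categories.

    Categories without a modelled intersection contribute nothing.
    """
    topics = set()
    for category in categories:
        topics |= intersections.get(category, set())
    return sorted(topics)


# A bot could turn the result into claims on the file.
example = topics_for_file(["Category:Churches in Haarlem"], INTERSECTIONS)
```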

The images with the connected topics should be fed to a search engine. How the different categories and items relate to each other should be fed to the search engine too. Faceted search would probably be a really nice replacement for the current static category system. We should build this as a prototype (probably on labs) so we can get a feel for what works (and what does not) without disturbing the current system.
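To give an idea of what faceted search means here, the following toy sketch builds an inverted index from topics to files, answers a query by intersecting the selected topics, and counts the remaining topics as facets to narrow down further. It is an in-memory illustration with made-up file names, not a proposal for the actual search engine.

```python
from collections import Counter, defaultdict

# Hypothetical files with their connected topics.
FILES = {
    "File:A.jpg": {"Q9920", "Q16970"},  # Haarlem, church
    "File:B.jpg": {"Q9920"},            # Haarlem only
    "File:C.jpg": {"Q16970"},           # church only
}


def build_index(file_topics):
    """Inverted index: topic -> set of files that have that topic."""
    index = defaultdict(set)
    for filename, topics in file_topics.items():
        for topic in topics:
            index[topic].add(filename)
    return index


def search(index, selected):
    """Return the files that have all selected topics."""
    result = None
    for topic in selected:
        files = index.get(topic, set())
        result = set(files) if result is None else result & files
    return result if result is not None else set()


def facet_counts(file_topics, matches, selected):
    """Count the not yet selected topics among the matching files."""
    counts = Counter()
    for filename in matches:
        for topic in file_topics[filename]:
            if topic not in selected:
                counts[topic] += 1
    return counts
```

Selecting Haarlem (Q9920) would return both Haarlem files and offer church (Q16970) as a facet to narrow the result further, which is what a static category tree cannot do.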

Tracking is recording where something is used, and purging is regenerating a page because something it uses has changed. Right now this is done for templates. Take for example {{Information}}. The usage of this template is tracked. Say I change the color of the template: all the pages using this template have to be regenerated to show {{Information}} with the new color. This is the purging part.
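The two concepts can be sketched in a few lines. The class and method names below are hypothetical and do not reflect how MediaWiki implements this internally; the sketch only shows the bookkeeping: record usage, and on a change queue every page that uses the changed thing for regeneration.

```python
from collections import defaultdict


class UsageTracker:
    """Toy model of tracking (who uses what) and purging (regenerate on change)."""

    def __init__(self):
        self._used_on = defaultdict(set)  # thing -> pages using it
        self.purge_queue = []             # pages waiting to be regenerated

    def record_usage(self, page, thing):
        """Tracking: remember that `page` uses `thing` (e.g. a template)."""
        self._used_on[thing].add(page)

    def pages_using(self, thing):
        return sorted(self._used_on.get(thing, set()))

    def on_change(self, thing):
        """Purging: queue every page that uses the changed thing."""
        pages = self.pages_using(thing)
        self.purge_queue.extend(pages)
        return pages


tracker = UsageTracker()
tracker.record_usage("File:A.jpg", "Template:Information")
tracker.record_usage("File:B.jpg", "Template:Information")
tracker.record_usage("File:C.jpg", "Template:Artwork")
purged = tracker.on_change("Template:Information")
```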

The current file pages are a mix of data and representations of that data. With data being moved to the Wikibase repository we can focus on a complete redesign. Assuming we have almost all data in a structured form, how would we design the new pages? We should completely rethink the way we show the information to the users. Mockups should be made, community discussion should happen, etc.

Preferably we should just have one template on the page that shows what is needed. So no {{Information}}, {{Book}}, {{Artwork}}, {{Creator}}, {{Institution}} and license tags. The template should give the right representation based on the claims added to the file. We should of course use the lessons learned from these templates in the new system.

Up until now we had one instance of Wikibase (the one on Wikidata). Storing the data about every file on Wikidata is probably not a good idea, so a local instance should be deployed to store that data. The two Wikibase instances should work together and we should avoid duplication. We will probably have very few items here, because for most things we would just link to Wikidata items. We'll have an object for every file and for every user. We'll have a number of string type properties which can be upgraded to item type properties. Say for example a painting made by Rembrandt might first have an author string-based claim "Rembrandt", and this can be changed to an author claim linking it to Rembrandt (Q5598). This is how we work with {{Creator}} templates now.
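The upgrade from a string type claim to an item type claim could look roughly like this. The classes, the property name `author` and the lookup table are illustrative assumptions, not the Wikibase data model; only the Rembrandt (Q5598) example comes from the text above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class StringValue:
    text: str


@dataclass(frozen=True)
class ItemValue:
    item_id: str


@dataclass
class Claim:
    prop: str
    value: Union[StringValue, ItemValue]


def upgrade_claims(claims, known_items):
    """Replace string values by item links where a matching item is known.

    Strings without a match are kept as they are, so nothing is lost.
    """
    upgraded = []
    for claim in claims:
        value = claim.value
        if isinstance(value, StringValue) and value.text in known_items:
            upgraded.append(Claim(claim.prop, ItemValue(known_items[value.text])))
        else:
            upgraded.append(claim)
    return upgraded


# Hypothetical lookup from author strings to Wikidata items.
KNOWN_AUTHORS = {"Rembrandt": "Q5598"}
before = [Claim("author", StringValue("Rembrandt")),
          Claim("author", StringValue("Unknown painter"))]
after = upgrade_claims(before, KNOWN_AUTHORS)
```

Keeping unmatched strings untouched means the upgrade can be run repeatedly as more items become available.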

We should be adding claims based on the current categorization so we can slowly start replacing that.

We should probably focus on getting this working for new uploads first, then focus on important files and, in the end, based on experience, do the bulk of the files. This is going to involve a lot of bot work.