Tuesday, June 13, 2017

#Wikidata some assertions

Wikidata is no different from any other community: there are differences of opinion. Everybody has his or her own perspective, but some assertions can be made that have a more universal resonance.

The assertions below represent the underlying arguments I use in my blog posts and in the discussions I take part in. They are the ones I feel are not necessarily "political" and do not have a negative impact.

Thanks,

GerardM

There is no data store without problems; this includes Wikipedia and Wikidata.

The data we hold is best understood by applying set theory. The data in Wikidata consists of many subsets; probably the most valuable subset for the WMF is the interwiki links.

The error rate in each subset can be assessed separately, and it will generally differ from the overall Wikidata error rate.
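To make the point about subsets concrete, here is a minimal sketch. The subset names and all counts are invented purely for illustration and are not real Wikidata figures; the subsets are also assumed not to overlap, which real subsets may well do.

```python
# Hypothetical counts, invented for illustration only; not real Wikidata figures.
# Assumption: the subsets are disjoint, so totals can simply be summed.
subsets = {
    "interwiki links": {"statements": 1_000_000, "errors": 5_000},
    "dates of birth": {"statements": 200_000, "errors": 8_000},
    "awards": {"statements": 50_000, "errors": 4_000},
}


def error_rate(errors, statements):
    """Fraction of statements that are wrong."""
    return errors / statements


# Error rate assessed for each subset on its own.
per_subset = {
    name: error_rate(s["errors"], s["statements"]) for name, s in subsets.items()
}

# Error rate over everything taken together.
total_errors = sum(s["errors"] for s in subsets.values())
total_statements = sum(s["statements"] for s in subsets.values())
overall = error_rate(total_errors, total_statements)

for name, rate in per_subset.items():
    print(f"{name}: {rate:.2%}")
print(f"overall: {overall:.2%}")
```

With these made-up numbers the largest subset dominates the overall rate, so a single overall figure says little about the quality of any one subset.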

The absence of data often indicates a bias in the data Wikidata holds. A good example is the lack of data relevant to the global south.

Within the huge influx of data from Wikipedia, the biggest imports have been from English Wikipedia; this is one reason for the existing biases in Wikidata.

An absence of data prevents the application of tools. Tools may suggest writing a Wikipedia article; tools may compare data with other sources.

Concentrating on the differences between Wikidata and any other data source is the most effective way of improving the quality of existing data in either data set.
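A minimal sketch of what concentrating on the differences could look like. The identifiers, the values and the `compare` helper are all hypothetical and only serve as an illustration; this is not how any particular tool does it.

```python
# Hypothetical records keyed by a shared identifier; the values stand for
# one property, for instance a date of birth. All data here is invented.
wikidata = {"Q1": "1950-01-01", "Q2": "1962-05-17", "Q3": "1971-11-30"}
external = {"Q1": "1950-01-01", "Q2": "1962-05-18", "Q4": "1980-02-02"}


def compare(a, b):
    """Return conflicting values and the keys missing on either side."""
    shared = a.keys() & b.keys()
    # Same identifier, different value: one of the two sources is wrong.
    differences = {k: (a[k], b[k]) for k in sorted(shared) if a[k] != b[k]}
    # Identifiers known to only one source: data missing in the other.
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    return differences, only_a, only_b


differences, only_wikidata, only_external = compare(wikidata, external)
print("to check by hand:", differences)
print("missing in external source:", only_wikidata)
print("missing in Wikidata:", only_external)
```

Everything the two sources agree on is left alone; only the short list of differences needs human attention, and fixing an item improves whichever source was wrong.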

Having an application for the data in Wikidata is the best way to improve the usefulness of a subset of data.

Each contributor to Wikidata works on the data set(s) of his or her own choice, and these data sets interact within the whole of Wikidata. This may raise issues, and this cannot always be avoided.

Examples of problematic data must be seen in the light of the total of the data set they are part of. Statistically they may be irrelevant.

No matter how "bad" an external data source is, when it is willing to cooperate on the identification and curation of mutual differences, it is worthy of collaboration.

Wikidata improves continually and as such it is "purrfect" but it will never be perfect.