It's a blog

Tag: mediawiki

The RevisionSlider is an extension for MediaWiki that has just been deployed on all Wikipedias and other Wikimedia websites as a beta feature. The extension was developed by Wikimedia Germany as part of their focus on technical wishes of the German-speaking Wikimedia community. This post will look at the RevisionSlider's design, development and use so far.

The refactoring started as part of [RFC] Expiring watch list entries. After an initial draft patch was made touching all of the necessary areas, it was decided that refactoring first would be a good idea, as the change initially spanned many files. It is always good to do things properly ® instead of pushing forward in a hacky way and increasing technical debt.

The idea of a WatchedItemStore was born: it would take lots of logic out of the WatchedItem class, along with other watchlist database related code that was dotted around the code base, such as in API modules and special pages.

As with last year the event took place in the Mission Bay Center, San Francisco, California. The event was slightly earlier this year, positioned at the beginning of January instead of the end. The event format changed slightly compared with the previous year and also included a 3rd day of general discussion and hacking in the WMF offices. Many thanks to everyone that helped to organise the event!

I have an extremely long list of things to do that spawned from discussions at the summit, but as a summary of what happened, below are some of the more notable scheduled discussion moments:

This library is the first of the addwiki collection that has actually reached 1.0.0 let alone 2.0.0! All of the other libraries, including wikibase-api and mediawiki-api are still a work in progress with lots to be added. The next likely to be released will be the wikibase-api library once I try to also add async functionality there!

A snippet of the async functionality added in mediawiki-api-base can be seen below:

Recently I have been spending lots of time looking at the Wikimedia graphite set-up due to working on Grafana dashboards. In exchange for what some people had been doing for me I decided to take a quick look down the list of open Graphite tickets and found T116031. Sometimes it is great when such a small fix can have such a big impact!

I don’t mean Mediawiki is crap! The Change Risk Anti-Patterns (CRAP) Index is calculated based on the cyclomatic complexity and code coverage of a unit of code. Complex code and untested code will have a higher CRAP index compared with simple, well-tested code. Over the last 2 years I have been tracking the CRAP index of some of Mediawiki’s more complex classes as reported by the automatic coverage reports, and this is a simple summary of what has been happening.
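As a rough illustration of how the two inputs combine, the commonly cited formula is CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m). The sketch below assumes that formula; the function name crapIndex() is made up for this example and is not part of Mediawiki or of any coverage tool.

```php
<?php
// Hypothetical helper: computes the CRAP index for one unit of code.
// $complexity is the cyclomatic complexity, $coveragePercent is 0..100.
function crapIndex( $complexity, $coveragePercent ) {
    // Fraction of the code that is NOT covered by tests.
    $uncovered = 1 - ( $coveragePercent / 100 );
    // comp^2 * uncovered^3 + comp
    return pow( $complexity, 2 ) * pow( $uncovered, 3 ) + $complexity;
}

$untested = crapIndex( 10, 0 );      // complexity 10, no coverage
$halfTested = crapIndex( 10, 50 );   // complexity 10, half covered
$fullyTested = crapIndex( 10, 100 ); // complexity 10, fully covered
```

With full coverage the index falls back to the complexity itself, so tests alone cannot make very complex code score well.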

So it turns out the at() method doesn’t quite do what I had initially thought….

I have recently been working on some tests for the new Newsletter extension for Mediawiki, specifically to test the NewsletterTablePager class. This class extends the TablePager class in Mediawiki, which is designed to make it easy to display information from a database table on a special page of a Mediawiki site, and to easily enable things such as sorting.

The code interacts with the database and gets a ResultWrapper object, and the Pager uses the numRows(), seek() and fetchObject() methods, all of which I thought would be incredibly simple to mock.

Attempt 1

My first attempt, where I first noticed I had been thinking about the at() method all wrong, can be seen below:

private function getMockDatabase( array $resultObjects ) {
    $mockResult = $this->getMock( 'ResultWrapper' );
    $mockResult->expects( $this->atLeastOnce() )
        ->method( 'numRows' )
        ->will( $this->returnValue( count( $resultObjects ) ) );
    $mockResult->expects( $this->any() )
        ->method( 'seek' );
    foreach ( $resultObjects as $index => $resultObject ) {
        $mockResult->expects( $this->at( $index ) )
            ->method( 'fetchObject' )
            ->will( $this->returnValue( $resultObject ) );
    }
    $mockDb = $this->getMock( 'IDatabase' );
    $mockDb->expects( $this->atLeastOnce() )
        ->method( 'select' )
        ->will( $this->returnValue( $mockResult ) );
    return $mockDb;
}

This method returns a mock Database that the Pager will use. As you can see, the only parameter is an array of objects to be returned by fetchObject(), and I am using the at() method provided by phpunit to return each object at the index at which it is stored in the array. This is when I discovered that at() in phpunit does not work in the way I first thought…

at() refers to the index of calls made to the mocked object as a whole. This means that in the code sample above, all of the calls to numRows() and seek() increase the current call counter index for the object, and thus my mocked fetchObject() method never returns the correct value, or returns null.
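To make the object-wide counting concrete, here is a small plain-PHP simulation. It does not use phpunit at all; CallRecorder is a made-up stand-in that only records the order of calls, and the call sequence is an assumed example of what a Pager might do.

```php
<?php
// Hypothetical stand-in for a mock object: every call, whatever the method,
// is appended to one shared list, mirroring the object-wide index at() uses.
class CallRecorder {
    private $calls = [];

    public function __call( $name, array $args ) {
        $this->calls[] = $name;
        return null;
    }

    // Returns the object-wide indexes at which $method was invoked.
    public function indexesOf( $method ) {
        return array_keys( $this->calls, $method );
    }
}

$result = new CallRecorder();
$result->numRows();     // object-wide index 0
$result->seek( 0 );     // object-wide index 1
$result->fetchObject(); // object-wide index 2
$result->fetchObject(); // object-wide index 3

// The fetchObject() calls sit at indexes 2 and 3, so expectations
// registered with at( 0 ) and at( 1 ) would never line up with them.
$fetchIndexes = $result->indexesOf( 'fetchObject' );
```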

Attempt 2

In my second attempt I made a guess that phpunit might allow multiple mocks of the same method to stack, so that their return values would be returned in the order that the mocks were created. Thus I changed my loop to simply use any():

foreach ( $resultObjects as $index => $resultObject ) {
    $mockResult->expects( $this->any() )
        ->method( 'fetchObject' )
        ->will( $this->returnValue( $resultObject ) );
}

But of course this also does not work: it results in the same $resultObject being returned for all calls.

Final version

I ended up having to do something a little bit nasty (in my opinion): use returnCallback() with a private member of the testcase acting as a call counter / per method index within the callback:

$testCase = $this;
$mockResult->expects( $this->any() )
    ->method( 'fetchObject' )
    ->will( $this->returnCallback( function () use ( $testCase, $resultObjects ) {
        // Return the object for the current call, then advance the per method counter.
        $obj = $resultObjects[$testCase->mockSeekCounter];
        $testCase->mockSeekCounter += 1;
        return $obj;
    } ) );

Notes

It would be great if phpunit would have some form of per method index expectation!
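One way to avoid the private member is to keep the counter inside the closure itself. The sketch below is plain PHP; makeSequentialCallback() is a hypothetical helper and not part of phpunit, and returning false once the values run out is an assumption about how fetchObject() signals that there are no more rows.

```php
<?php
// Hypothetical helper: builds a callback that hands out the given values
// one per call, using a counter captured by reference instead of a
// property on the test case.
function makeSequentialCallback( array $values ) {
    $values = array_values( $values ); // make sure keys are 0, 1, 2, ...
    $index = 0;
    return function () use ( $values, &$index ) {
        // Assumption: false stands for "no more rows".
        $value = array_key_exists( $index, $values ) ? $values[$index] : false;
        $index++;
        return $value;
    };
}

$fetchObject = makeSequentialCallback( [ 'row A', 'row B' ] );
$first = $fetchObject();
$second = $fetchObject();
$third = $fetchObject();
```

In the test this could be passed straight to returnCallback(), for example $this->returnCallback( makeSequentialCallback( $resultObjects ) ). phpunit also has an onConsecutiveCalls() stub that may cover the same need.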

Rawmode was a boolean value used to determine if an API result formatter in Mediawiki needed extra metadata in order to correctly format the result output. The main use of said metadata was in the XML output of the Mediawiki API. How hard can removing it be? This is the story of the struggle to remove the use of this single boolean value from the Wikibase codebase.

Overview

The first commit for this task was made on the 6th July 2015 and the final commit was about to be merged on the 27th August. So the whole removal took just under 2 months.

During these two months roughly 60 commits were made and merged working towards removal.

Overall 9290 lines were removed and 5080 lines were added.

I’m glad that is all done. (This analysis can be found on Google Sheets). Sorry there are not more pictures in this post…..

Reason for removal

Well, rawmode is being removed from Mediawiki to reduce API complexity. Instead of having to check what the API formatters need, callers will just provide all metadata, and the formatters will simply use what they need and discard the rest.

The change to “Finish killing ‘raw mode’” can be seen on Gerrit and has been around since April of this year. The relevant task can be found on Phabricator.

Process overview

The first step on the path was to remove the old serialization code from Wikibase (otherwise known as the lib serialization code) and replace all usages with the new WikibaseDataModelSerialization component. This component was already used in multiple other places in the code but not in the API due to its reliance on the way the lib serialization code handled the rawmode requirement of the API at the time.

Removal of the lib serialization code was the first of the two major parts of the process, and after around 50 commits I managed to remove it all! Hooray for removing 6000 lines with no additions in a commit…

The next and final step was to make the ResultBuilder class in Wikibase always provide metadata for the API and to remove any dirty hacks that I had to introduce in order to kill the lib code. Again this was done over the course of multiple commits, mainly adding tests for the XML output which at the time was barely tested. Finally a breaking change had to be made to remove lots of the hacks that I had added and the final uses of raw mode.

The final two commits can be seen at http://gerrit.wikimedia.org/r/#/c/227686/ and http://gerrit.wikimedia.org/r/#/c/234258/

MassAction is a Mediawiki extension that allows users to perform mass actions on targets through a special page making use of the job queue. Its development started at some point in 2014 and a very rough experimental version is now available. Below are the basics.

Basic Concepts

Tasks are individual mass actions, comprised of smaller actions that are applied to multiple targets using matchers, for example replace the word ‘hello’ with ‘goodbye’ on all wiki pages with a title that contains the word ‘Language’ for pages that are not redirects.

Actions are processes that can be applied to Targets to alter some of the data that they contain, for example, change the title (move), change an article text etc.

Matchers or Filters are sets of rules that are used to match certain targets, for example all articles that contain the word ‘hello’.

All of these concepts are stored in new database tables (seen below).

Appearance

The main interaction with the extension is done through a special page. This page allows the creation of tasks as well as the viewing of previously created tasks and various actions such as saving changes.

A wireframe showing task creation can be seen below. This allows basic information about the task to be entered, such as what type of target we want to change; this could be an Image, Article, Wikibase item etc. It also allows for a summary of the changes that will be made.

The lower sections of the page allow for the input of an unlimited number of Actions and Matchers/Filters to be added.

The version of the special page that allows users to view tasks is slightly different and can be seen below.

The main differences here are that no new data can be added (it is simply presented) and that a Task state and a list of targets are now present.

Upon creation of a Task the Task will make its way through various states (seen below).

Once the targets have been found they will appear in the targets list on the special page and users will be able to either save changes individually or save a whole list of changes.

Current State

The code is currently stored on Wikimedia’s Gerrit tool and is mirrored onto GitHub. All issues are now tracked in Phabricator and the current workboard can be found here.

A screenshot of the current special page for task creation can be seen to the right.

Of course at this early stage lots of things are missing and I hope I find the time to work on this over the next year: