How can we improve the peer review process?

One of the issues that came out of the really interesting discussion about code freeze was the problem of getting peer reviews completed. Having seen how that discussion evolved, I thought it was worth starting another thread.

To start off with... we are definitely improving! In 2011/12 we did 769 peer reviews, in 2012/13 we did 1539, and in the last 12 months 2200 peer reviews have been completed. Much more important than that, the quality of reviews is improving (at least from my perspective as an integrator). Twelve months ago we were still regularly integrating code which had not been peer reviewed; today it's a rarity. Moodle code has never been as thoroughly reviewed as it is today.

That said, we continue to have a massive backlog of issues waiting for peer review, and we have an even more gigantic pile of patched issues which have never been looked at.
This is bad, I'm sure you know, but to reiterate:

The longer an issue is delayed before integration, the greater the chance it will no longer be relevant or will conflict with other changes

Contributors tend to be much better able to respond to review comments soon after initially submitting code

You can get a new contributor 'hooked' by reacting quickly to their patch (it worked for me!)

Here are some of the causes of this problem, as I see it:

There is an incredible amount of code being created by the Moodle community and it's not as fun to review code as to write it!

In Moodle HQ, we don't quite have the right process/balance/prioritisation for managing peer reviews (in contrast to Integration Review, where we have a full-time team dedicated to the task). I don't think it is necessary/appropriate for the community to discuss this! I'm just acknowledging that I don't think it is a perfect situation.

There is a degree of expectation that Moodle HQ should be responsible for all peer reviews.

Developers don't feel knowledgeable enough about a code area to peer review many issues, so they ignore them.

Some issues are so big/fundamental/controversial they get ignored because the peer reviewer wouldn't know the next step.

Some issues are of such poor quality, or headed so far in the wrong direction, that they get ignored because it is quite difficult to give negative feedback.

Some thoughts of mine:

It could be beneficial to have a way of separating the big/controversial/hard issues from the smaller ones, with the goal of processing the smaller ones as fast as possible (and keeping a small number waiting in the queue). Ideally we'd also tell the developer of the trickier issue that it is harder to review and that they should expect delays. My argument against this is that I'm not sure whether this triaging would just move the problem around.

Somehow I would like to empower you to peer review something you are uncomfortable with, especially if the issue has been waiting for a while. There are many areas of Moodle which are not well understood by many people, and by leaving the issue to someone else we don't help solve that. It's better that the assignee gets some feedback, even if you are not a domain expert. (Note: this is kind of what the Integration team have to do on some issues on a weekly basis.)

First, thanks for the efforts of three people (Damyon, Dan, and Marina) in Moodle HQ in reviewing my large change for 2.7!

Second, I agree with the 'small change vs large change' and also the 'not in my area' vs 'not that complicated anyway' issue. I wonder if there should even be different states or labels, like 'Waiting for peer review (anyone)' or 'Waiting for peer review (HQ)' (or 'Component maintainer', or something else not necessarily HQ).

Tim and I often review each other's small changes. Hopefully this is useful, but for large changes it can be inappropriate, especially as we obviously come from the same institution and so are more likely to have the same take on things.

In general, while it would be great to have community contributions in peer review, it seems harder to achieve, especially for those of us who are doing this as part of our regular job. Say there is a simple issue waiting for peer review: I could probably do the review, but unless it happens to be related to something my employer wants done, I can't really justify it. Even at a volunteer level the same thing probably applies. If I use glossary a lot and there's a minor bug I really want fixed, I'll submit a fix for the bug; but suppose there is something waiting for peer review about forum, I wouldn't have the interest or expertise to review it (supposing I even see it).

Evil thought: you could have some mechanism whereby, when you mark an issue as waiting for peer review, if you then peer review some other issue yourself (while waiting), your own waiting issue gets a priority peer review tag or something...

I must say that until Jerome left, I was lucky, because I could always ask him to review my contributions (mostly Web Services code and bug fixes related to Web Services).

Currently I'm not so lucky, and I must beg for peer reviews. I think that you have listed the causes of this problem perfectly.

I think that peer review could be part of the Moodle development process. Mondays and Tuesdays are for integration review, Wednesday is for testers... why not Thursdays for peer review? I know that peer review is something that can happen at any time, but... it would be like a "peer review mini sprint".

Well, it could also be Tuesdays and Fridays (just an hour after lunch). I think that a combination of triaging and time slots for peer review would help to solve the current problems.

I also like Sam's idea. We could also apply gamification techniques here...

Tim and I often review each other's small changes. Hopefully this is useful, but for large changes it can be inappropriate, especially as we obviously come from the same institution and so are more likely to have the same take on things.

Just to say that I think this is a practical way to work, and I agree with your assessment that it's better to avoid it for larger changes.

Say there is a simple issue waiting for peer review: I could probably do the review, but unless it happens to be related to something my employer wants done, I can't really justify it.

Yeah, although in a non-direct way it is of benefit to your employer: if there is one fewer issue waiting for peer review, it (ideally) means the chances of your issue getting reviewed sooner are increased. So from a theoretical point of view I like your evil idea, though from a practical point of view I think we are all concerned that putting even more obligations on contributors is a tough path to go down.

Of course, when you say "There is a degree of expectation that Moodle HQ should be responsible for all peer reviews.", you forget about components like quiz/question where there is an active component maintainer.

I manage to review 90+% of patches very quickly. However, once an issue has got stuck in limbo, it can sit there for a very long time. I think that is just me.

I completely agree that quick turn-around on reviews is important for encouraging contributors. Working through any given change involves building up a certain amount of mental state to understand the code, and it is much easier to make any requested changes while you still remember all that. Similarly, if you peer review something and point out that changes are needed, it is easiest if those changes come back quickly for re-review, while you still remember how the patch worked. So, quick turn-arounds are important on both sides.

It is certainly worth highlighting small changes. I have a proposal for that:

CiBoT should not just add comments to issues. For an issue that gets reworked many times, that just adds lots of clutter. Instead I think we should have specific custom fields like:

CiBoT status: [blank / all good / problems]
CiBoT last results: [link to the smurf.html file]
CiBoT commit checked: [commit hash of what was checked, as a link to github]
CiBoT diff size: [git diff --shortstat output]

That last field makes it easy to search for small changes. The other fields make it easy to see at a glance what the CiBoT status is, and avoid cluttering the comment history with this. If you ever need to see all the versions that were checked, the tracker keeps a full history of all fields.
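To give an idea of what that diff-size field would contain: `git diff --shortstat` emits a single summary line such as ` 3 files changed, 42 insertions(+), 7 deletions(-)`, and any of the three counts may be absent. A bot populating the field could extract the numbers with something like the following (a hypothetical sketch only, not real CiBoT code; `parse_shortstat` is a made-up name):

```python
import re

def parse_shortstat(line):
    """Parse a `git diff --shortstat` summary line into a dict of counts.

    Each of the three counts is optional in real git output,
    e.g. " 1 file changed, 2 insertions(+)" has no deletions part.
    """
    counts = {"files": 0, "insertions": 0, "deletions": 0}
    patterns = {
        "files": r"(\d+) files? changed",
        "insertions": r"(\d+) insertions?\(\+\)",
        "deletions": r"(\d+) deletions?\(-\)",
    }
    for key, pattern in patterns.items():
        m = re.search(pattern, line)
        if m:
            counts[key] = int(m.group(1))
    return counts

print(parse_shortstat(" 3 files changed, 42 insertions(+), 7 deletions(-)"))
```

With numbers like these stored in a custom field, a tracker search for "small changes" becomes a simple numeric comparison rather than a trawl through bot comments.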

I forgot one thing I wanted to say. Regarding "Some issues are so big/fundamental/controversial they get ignored because the peer reviewer wouldn't know the next step."

Any issue like that should not be put up for peer review. Before any code was written, there should have been a forum thread that reached some sort of consensus.

To put it another way: if I am responsible for peer reviewing something controversial, and the developer disagrees with me, then I don't want to have a two-person argument with the developer. I want the developer to start a forum thread to canvass more opinions. Hopefully, with enough eyes, a good solution can be found, or at least the least bad compromise.

But frequently they are, and that's fair enough, because it's not always clear to the person who wrote the code why their change is controversial. Indeed, explaining that is probably also something which stops people from reviewing those changes.

1. Making it a bit more obvious that it is something you want developers to do

2. Making it clear how to do it

3. Providing easy access to a list of what needs reviewing.

To be honest it never really occurred to me that I could be peer reviewing things - I guess the phrase comes with a certain weight of responsibility. This is linked to your point 3 but is more "am I qualified to do it" rather than "I'm not expected to do it".

For my point 1 I just did some searching to find out how to peer review and found this:

"The component lead should peer-review the change. If there is no component lead for an affected component, any other recognised developer may complete the peer review."

For 3, I just went looking for tickets that needed peer reviewing. Although I could create my own filter in JIRA for tickets with a status of Peer Review, I eventually also found the Peer Review Dashboard, which is much more helpful. It was tucked away, though (under Tools > Find Dashboards > search for "Peer Review"). I think it would be good to make it much more prominent.
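For anyone who would rather build their own filter than hunt down the dashboard, a JQL query along these lines should work (I'm assuming the status is named 'Waiting for peer review', as it appears elsewhere in this thread; check the exact name against the tracker):

```
project = MDL AND status = "Waiting for peer review" ORDER BY updated ASC
```

Sorting oldest-updated first surfaces the issues that have been waiting longest.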

I definitely agree that timeliness is critical to encourage engagement. It's discouraging as a dev to write up patches and features for Moodle and see them sit around for weeks or months (and then get bumped back to the next 6mo release cycle). It makes it harder to stay engaged.

Part of the problem that I personally see is that it's not clear that me doing a peer review is actually helpful overall. Yes, it might catch things, but it doesn't really move the issue forward in the process. If I do a peer review, I can only either leave it in 'Waiting for peer review' status or send it back to development in progress. Since I don't have the permission to send it to integration, it still has to sit around waiting for someone with that permission to mark it for integration. Maybe another ticket state, or a label, could indicate that peer review has been done and that it only needs a less in-depth look by someone with the permission to submit it.

Basically, the bottleneck is sending for integration, as that is the part that has the fewest people who can contribute, but the process as it is laid out now tangles peer review and sending for integration into one apparent step. They should be disentangled (or the number of devs trusted to mark for integration should be expanded, especially since there is a more robust integration review process in place anyway).

Spot on, Eric. Without a (standard) way to differentiate peer-reviewed tickets from ones which are still awaiting peer review (other than sending to integration, which as you say is a permission that not many people have), it doesn't seem as though it'd really help much. In practice, once you post a comment with your peer review results, somebody from HQ (or with permission to put it up for integration) might be more likely to see those results and push it along. But that would be much more likely to happen if we could either update the ticket state (probably the preferable option) or add an agreed-upon tag to flag it as having been reviewed; otherwise it relies on somebody with the permission already being at least a watcher on the ticket (which isn't always the case).

(Sorry if this is a bit of a jumbled mess - it's a quick brain dump before flying out the door well past usual home time!)

Yeah, I agree. I also think that a track record of saying 'this is ready for integration' at the right time would be a good way of determining whether a peer reviewer has proven they know when something is ready and can be given the ability to send for integration.

In an ideal world, everyone would be able to send an issue for integration once it had been peer reviewed. The reason the permission is controlled is just so that we don't end up with an incredibly huge list of unsuitable issues sent for integration, which would waste time and mean good quality issues don't get processed. (Or we'd have the same problem in integration review as we currently have in peer review!)

I plan to raise this at our next Integration team meeting.

In the meantime, as I mentioned in my reply to Eric, feel free to use the label 'ready-for-integration' on issues and I will aim to handle it informally. (Hehe, I suppose I should say: lose my trust by abusing that label at your peril!)

the number of devs trusted to mark for integration should be expanded,

IIRC the written policy on having integration permissions is that they are reserved for component maintainers and Moodle HQ. Personally, I think we could afford to expand that, but it's hard to write a policy for who you allow. I mean, my policy would be something along the lines of 'people who have a track record of not sending rubbish to integration', which includes many more people than those who have enough time to volunteer to maintain a component. In fact, I believe this is essentially the same as the old way we gave out CVS access. Of course, I'm also the kind of person who doesn't believe there is a need to have a written policy for everything.

I agree that 'sending for integration' is a problem (of process and communication), but I disagree with your assessment that it is the bottleneck compared to actually reviewing the issues. Anyway, in the meantime, while an official solution is worked out, I'm happy to help with this. I've just set up a filter which searches for 'project = MDL AND labels in (ready-for-integration)' and emails me every day. Feel free to use this label :P

My main problem with peer reviewing is that after I peer review something and the developer fixes whatever needs to be fixed, it kind of ends up in a "limbo" state, and I sometimes need to tag someone else on the ticket (usually someone from Moodle HQ) to beg them to push it on, as it is ready for integration review.

I like the idea of using the "ready-for-integration" label and will try to remember that the next time I peer review.

Also, I agree with Tim's comments that the CI bot often spams the JIRA comments. It would be nice if fields in the ticket were updated instead as to what, if any, problems the CI bot sees.

Wikipedia has a lot of initiatives to encourage new contributors to get involved in the review process, and tools too.

A question to consider though is do we want to encourage people who are not experienced developers to do peer reviews? Particularly when it comes to coding standards, do we want to trade quality for quantity? It sounds like from the thread above that perhaps we do. Is there then maybe another stage of review before integration?

No. It's worth noting that we always have the integration review following the peer review, so there is a certain degree of backstop there anyway. But that does not mean a developer who is 'inexperienced with Moodle' can't do a peer review and provide considerable value. In fact this already happens, because at Moodle HQ we have new developers starting relatively frequently. They don't always have previous Moodle experience, but they can provide different perspectives on a problem.

I've gone into the "issues waiting for peer review" list a few times and got as far as selecting an issue I thought I might know something about. (I should say I'm also in the situation Sam Marshall describes, where it's difficult to justify spending work time on peer review, but I'm happy to put some time into it on a rainy lunch hour, which is most of them here in Glasgow.)

The main thing that put me off actually doing it was your point 4 - that I wasn't sufficiently knowledgeable to catch everything. Any time I've had code peer reviewed there have always been useful comments on things I hadn't even considered, and I wasn't convinced I would be able to give that kind of depth of feedback ("this would break during upgrade", "this will give the wrong behaviour if $CFG->someobscuresetting is set", "that function will be deprecated in the next release" - all those kind of things).

It's only on reading this forum thread that I'm now realising this was based on a misunderstanding. I always thought that the integration review was mainly to do with catching regressions and things like that, and that the peer review was effectively the actual code review, but on reading the docs on the integration review I see now that it includes a "final code review". This puts a different perspective on peer review for me and I'd be much more inclined to go and do one now.

I wonder if this is part of the issue - that peer reviews are perceived as holding more weight and responsibility than maybe they're intended to?

Please could I add that there appear to be a number of things going on here in this thread:

1. Code/peer review for enhancements (those items that are established and need a fix or updating)

2. New items for new releases, with arguably more complex code, thus requiring more senior/experienced peer review (I could be wrong)

3. Workloads: people are always going to be comfortable in their own zone of expertise, and in addition those working elsewhere need to prioritise

1 = Is there a process for checking that existing items are (a) still very much needed in practice, and (b) worth blending with items that risk becoming obsolete in application, e.g. evaluation/survey (just a random example)? I remember Sam stating that certain items do catch one's eye if they are of purpose to the dev. Of course this makes sense, but it doesn't make sense to be fixing/enhancing items that may be past their sell-by date, with respect.

2 = I watched the Samba guy's videos (can't remember his name, sorry), but he stated clearly that, in this world of open source and whatnot, the senior members of the community deal with the new items and the peer review of those items, and so on. I could be wrong.

3 = This sounds as though it has the potential to create a lot of confusion. I mean, we have Glasgow/OU/Moodle among others involved with this, and that is a lot of business in the fold to be juggling. Sam pointed out the difficulties of two people from the same institution reviewing the same code: a tricky potential bias element.

It just seems to me that a backwards approach might be a good start: reduce focus by putting to one side the established items that are not being used frequently (forum input might give an indication of this), and if a bug turns up, those new to the arena might wish to gain experience of peer review that way. Or why not have a senior reviewer alongside a junior one? Not literally side by side, but both reviewing the same code.

Of course, if you never do peer reviews, you will not get better at peer reviewing.

I think you just have to dive in and try your best.

It is helpful to just add a comment like

"I looked at the patch and spotted these issues ... however I don't feel confident with this area of the code, so I think someone else should also peer review it again after those issues are fixed, before it gets submitted for integration."

I think this illustrates the point I was trying to make, though. If someone adds a patch to an issue I'm watching I'll generally have a quick look at it, and on the odd occasion when I've noticed a flaw I'll leave a comment like this, so I guess I've been doing peer review. I think the problem is that I had a perception of The Peer Review being something a lot more formal, but I think I'm beginning to see that that's not necessarily the case.

Ideally, everything you talk about would of course be picked up in peer review. But many of you who are reading this thread have much more experience with some of the tricky Moodle issues and caveats than perhaps you are giving yourselves credit for! We are all learning all the time, and of course it is called peer review for a reason!

It would not be a surprising scenario to see a number of people pass on a peer review because they don't feel qualified for it, the issue then waiting around for two months before being reviewed by a relatively new Moodle HQ developer who is much less exposed to Moodle's subtleties than any of the community members who left it.

I don't want to give the wrong impression, though. I think that this concern about reviewing something which you might not be the best person to review is natural; indeed it's something which I have struggled with in integration review too. It'd be great if we had an army of component maintainers reviewing things as soon as they are put up for peer review, but that's not today's reality, and I think that slow or absent reviews do much more harm than reviews with some expert knowledge missing.

"Peer review" is another phrase that can be used as a substitute for "quality control". From a systems perspective, it does not really matter who checks what, as long as they are competent to do the checking. It does matter how many layers of "peer review" you put into place: the more peer review, the better the end product, as long as the peers are competent.

Having been in this position for something completely different, I know it is sometimes a nerve-wracking experience to be reviewed, but it is an essential one. After all, higher standards of quality control are the end result, and it is never, never personal.

A possible problem I think I am seeing is that the frequency of releases and updates are not helping maintain high levels of quality control. Please, do not misunderstand or misinterpret me, the product is good so far, but for how much longer? If standards inadvertently slip, as Tim seems to be indicating, then small errors accumulate, and large errors result. I am sure the testers, the Devs are all doing the best they can, but increasingly we are seeing complex systems collapse because of accumulated errors, even with high quality control standards. I suspect Moodle is no different.

I would offer that the release schedule could be cut in half and the freed-up time devoted to quality control of upcoming releases. The community would not mind, I suspect; no-one is breaking their neck to have a buggy product. They would much rather have a slower release of a healthy, tested, and fully integrated product.