Reason for difference is the scores were added multiple times to the total and non-emply grades were less because the count of students with graders was greater because of the multiple roles. ie: 14 students - 9 grades = 5 non empty grades when it should be 6.

I got here be investigating a related but different problem. Some queries in the gradebook correctly check AND u.deleted = 0, and others, including the two involved here, don't. Because of this, we were seeing averages that looked like -6.7 (-3) (that is, an average of minus three users) because it was subtracting a number of ungraded users that included deleted users from a total number of users that excluded deleted users.

However, grepping the gradebook code for "->gradebookroles" I find a number of other database queries which seem to suffer from similar problems, so I think we need to do a more careful review and try to fix all of them before we push this into 1.9. We also need to prepare a patch/branch for 2.0.

Tim Hunt
added a comment - 05/Mar/11 3:05 AM First note that MDL-15277 may well be a duplicate of this bug.
Next note that MDL-25879 is somewhat related.
I got here be investigating a related but different problem. Some queries in the gradebook correctly check AND u.deleted = 0, and others, including the two involved here, don't. Because of this, we were seeing averages that looked like -6.7 (-3) (that is, an average of minus three users) because it was subtracting a number of ungraded users that included deleted users from a total number of users that excluded deleted users.
I have now turned Tim Lock's patch, and a patch by me to take into account the deleted users issue, into a branch here: https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19
However, grepping the gradebook code for "->gradebookroles" I find a number of other database queries which seem to suffer from similar problems, so I think we need to do a more careful review and try to fix all of them before we push this into 1.9. We also need to prepare a patch/branch for 2.0.

By the way, I am getting hassled about this in relation to the OU VLE, so I would like to get this fixed ASAP. I know you have a public holiday in Australia on Monday, but hopefully we can work on this in the remainder of next week.

Tim Hunt
added a comment - 05/Mar/11 3:06 AM By the way, I am getting hassled about this in relation to the OU VLE, so I would like to get this fixed ASAP. I know you have a public holiday in Australia on Monday, but hopefully we can work on this in the remainder of next week.

OK, so reviewing all the mentions of ->gradebookroles in the code, as a plausible way to find all the DB queries that might be affected.

1. grade/import/lib.php line 175: Seems to be OK. Hmm... unless we should change
"AND ra.id IS NULL" to "AND (ra.id IS NULL OR u.deleted = 1)"?

2. grade/lib.php class graded_users_iterator::init(): Two queries here, both of which will duplicate users if they have multiple role assignments. Yes. For example, this is used by all the export code. If you have a user with two roles, then they are included twice in the export file.

(By the way, it is crazy the way this code is doing the $ofields stuff. Surely that is not really necessary!)

(Note that there is a nasty use of Error here, that gives the wrong place to go for the setting admin->appearance->graderoles)

4. grade/report/lib.php grade_report::get_numusers method: No problem with the multiple roles thing, because it is using COUNT(DISTINCT ...)) in includes u.delete = 0.

5. grade/report/grader/lib.php grade_report_grader::load_users method: Oh look at the comment at the top "// the MAX() magic is required in order to please PG". No, Postgres was pointing out this bug, and you chose to ignore it!

Anyway the queries here are wrong, but this is probably masked by the fact that get_records_sql eliminates duplicate rows when it converts the recordset to an array. Well, one of the queries does a SELECT DISTINCT, but those tend to give bad performance. Mind you, I am not sure that SELECT ... WHERE EXISTS is particularly efficient either.

(I also wonder why this method cannot use the graded_users_iterator class? Anyway, I will just fix the existing code.)

There is horrible duplicated code here. Yes, I will refactor.

6. grade/report/grader/lib.php grade_report_grader::get_avghtml method: This is where we started, and my previous commits should have fixed this. I just reviewed the code again and tweaked the coding style a bit.

7. mentions in mod/assignment/lib.php are not relevant to us (and appear to be fine on a quick look.)

Note that there are mentions of gradebookroles in other places in Moodle 2.0. For example, in 2.0 there are separate queries in the user report that are not present in 1.9. For now, let's get 1.9 right, before porting the changes to 2.0.

Note, I just did a quick test, which I think covered all the code I changed, and I think it worked for me.

Tim Hunt
added a comment - 08/Mar/11 8:07 PM OK, so reviewing all the mentions of ->gradebookroles in the code, as a plausible way to find all the DB queries that might be affected.
1. grade/import/lib.php line 175: Seems to be OK. Hmm... unless we should change
"AND ra.id IS NULL" to "AND (ra.id IS NULL OR u.deleted = 1)"?
2. grade/lib.php class graded_users_iterator::init(): Two queries here, both of which will duplicate users if they have multiple role assignments. Yes. For example, this is used by all the export code. If you have a user with two roles, then they are included twice in the export file.
(By the way, it is crazy the way this code is doing the $ofields stuff. Surely that is not really necessary!)
3. grade/report/lib.php grade_report::grade_report constructor: Just copies $this->gradebookroles = $CFG->gradebookroles; No problem.
(Note that there is a nasty use of Error here, that gives the wrong place to go for the setting admin->appearance->graderoles)
4. grade/report/lib.php grade_report::get_numusers method: No problem with the multiple roles thing, because it is using COUNT(DISTINCT ...)) in includes u.delete = 0.
5. grade/report/grader/lib.php grade_report_grader::load_users method: Oh look at the comment at the top "// the MAX() magic is required in order to please PG". No, Postgres was pointing out this bug, and you chose to ignore it!
Anyway the queries here are wrong, but this is probably masked by the fact that get_records_sql eliminates duplicate rows when it converts the recordset to an array. Well, one of the queries does a SELECT DISTINCT, but those tend to give bad performance. Mind you, I am not sure that SELECT ... WHERE EXISTS is particularly efficient either.
(I also wonder why this method cannot use the graded_users_iterator class? Anyway, I will just fix the existing code.)
There is horrible duplicated code here. Yes, I will refactor.
6. grade/report/grader/lib.php grade_report_grader::get_avghtml method: This is where we started, and my previous commits should have fixed this. I just reviewed the code again and tweaked the coding style a bit.
7. mentions in mod/assignment/lib.php are not relevant to us (and appear to be fine on a quick look.)
Note that there are mentions of gradebookroles in other places in Moodle 2.0. For example, in 2.0 there are separate queries in the user report that are not present in 1.9. For now, let's get 1.9 right, before porting the changes to 2.0.
Note, I just did a quick test, which I think covered all the code I changed, and I think it worked for me.
OK, all these changes pushed to https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19 . I am looking for review and testing please.

Sadly, it occurred to me to search all the gradebook code for 'SELECT' and there are some more queries with problems.

8. In both grade/report/overview/lib.php and grade/report/user/lib.php inside the if ($this->showrank) { bit, it should ignore grades belonging to un-enrolled or deleted users when computing the rank, but it includes them. The OU does not use ranks, so I am not inclined to fix this, Perhpas we could file a separate MDL bug about that.

9. Drat! I missed the second query in graded_users_iterator::init(). I'll do another commit. Done.

Tim Hunt
added a comment - 08/Mar/11 9:44 PM Sadly, it occurred to me to search all the gradebook code for 'SELECT' and there are some more queries with problems.
8. In both grade/report/overview/lib.php and grade/report/user/lib.php inside the if ($this->showrank) { bit, it should ignore grades belonging to un-enrolled or deleted users when computing the rank, but it includes them. The OU does not use ranks, so I am not inclined to fix this, Perhpas we could file a separate MDL bug about that.
9. Drat! I missed the second query in graded_users_iterator::init(). I'll do another commit. Done.
But otherwise, I think we have now found all the problems.

Tim Hunt
added a comment - 09/Mar/11 2:33 AM One of the grader report queries was broken in groups mode. Thanks to Colin for finding that.
Acutally, I can feel a bit smug, because it was Tim Lock's original patch that broke that That is the only problem found so far, and I have just done another commit to fix it.
So, the state of play is that https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19 is, to the best of my knowledge, the correct fix for 1.9. However it needs more testing.

Prior to working through the 1.9 patch Ive had a go at reproducing this in 2.0 and have done so. I made student and non-editing teacher gradeable roles then made a user that was already a student also a non-editing teacher. Averages immediately become whacky.

Ill set this same situation up in 1.9, apply the patch and see how we go

Andrew Davis
added a comment - 17/Mar/11 11:52 AM Prior to working through the 1.9 patch Ive had a go at reproducing this in 2.0 and have done so. I made student and non-editing teacher gradeable roles then made a user that was already a student also a non-editing teacher. Averages immediately become whacky.
Ill set this same situation up in 1.9, apply the patch and see how we go

Is there are reason Im not seeing for not doing it using an exists clause? The way youve done it certainly seems to work just fine but seems a little more complex so Im curious if Im not seeing something or if its just personal preference.

re the change in grade/import/lib.php, be wary of white space changes. They can make the integrators unhappy.

Andrew Davis
added a comment - 17/Mar/11 3:54 PM ok. Looking at https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19 and I have that branch on my machine.
Youve added queries that look like this:
SELECT u.id, u.firstname, u.lastname, u.imagealt, u.picture, u.idnumber
FROM mdl_user u
WHERE EXISTS (
SELECT 1
FROM mdl_role_assignments ra
WHERE u.id = ra.userid
AND ra.roleid IN (4,5)
AND ra.contextid IN (14,3,1)
)
AND u.deleted = 0
The following also seems to resolve the issue of duplicate rows. Using a distinct instead of the exists.
SELECT distinct u.id, u.firstname, u.lastname, u.imagealt, u.picture, u.idnumber
FROM mdl_user u
LEFT JOIN mdl_role_assignments ra ON u.id = ra.userid
WHERE ra.roleid IN (4,5)
AND ra.contextid IN (14,3,1)
AND u.deleted = 0
Is there are reason Im not seeing for not doing it using an exists clause? The way youve done it certainly seems to work just fine but seems a little more complex so Im curious if Im not seeing something or if its just personal preference.
re the change in grade/import/lib.php, be wary of white space changes. They can make the integrators unhappy.
Otherwise I think it looks good

In the past there have been nasty performance problems with SELECT DISTINCT queries, particularly where there are a lot of columns. This about what the DB has to do to implement it. Of course, we really ought to measure to confirm that the WHERE EXISTS (SELECT) version works better. I will try to do some basic timings of the queries on our live DB to see what I get.

I will get my colleagues to test the functionality on our acceptance test server. We want this bug fix ASAP.

I will re-base the branch onto the latest MOODLE_19_STABLE, including taking out the white-space only changes to grade/import/lib.php.

Tim Hunt
added a comment - 17/Mar/11 4:15 PM Thanks for looking at this.
In the past there have been nasty performance problems with SELECT DISTINCT queries, particularly where there are a lot of columns. This about what the DB has to do to implement it. Of course, we really ought to measure to confirm that the WHERE EXISTS (SELECT) version works better. I will try to do some basic timings of the queries on our live DB to see what I get.
I will get my colleagues to test the functionality on our acceptance test server. We want this bug fix ASAP.
I will re-base the branch onto the latest MOODLE_19_STABLE, including taking out the white-space only changes to grade/import/lib.php.
Will you then be able to do the 2.0 changes?

Tim Hunt
added a comment - 17/Mar/11 8:19 PM However, in this case I was wrong:
My query: 11,000 ms
Your query: 50 ms
Best query: 25 ms
SELECT u.id, u.firstname, u.lastname, u.imagealt, u.picture, u.idnumber
FROM mdl_user u
WHERE u.deleted = 0 AND u.id IN (
SELECT DISTINCT ra.userid FROM mdl_role_assignments ra
WHERE ra.roleid IN (5, 162)
AND ra.contextid IN (1161021, 830650, 32, 1))
That idiom seems to work better on other queries too, so I adoped it everywhere.
See the last commit on https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19 - note I have rebased this branch.
I also did a version with cleaned up history: https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19_simplehistory . I don't think there is any benefit in the full history, so I am proposing to use this version as the basis for a PULL request, once we have done a bit more testing here at the OU.

Tim Hunt
added a comment - 17/Mar/11 8:39 PM Grrr! I managed to screw up a few things. https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19_simplehistory now rewritten and fixed. I now think this is working on my development server. Time to put it on the OU's acceptance test server.

Grrr! Why fail PULL-487 out-of-hand because it is the sister of PULL-486, which affects 2.0 where the code if different because of the enrol stuff?

And, if you actually bother to read what was written above, then you will see that I did test all the 1.9 queries on the OU's database. (700,000+ users, 100,000+ role assignments, 100,000+ contexts. 3,000+ courses - I think it is actually much more than that but I don't have the exact number right now.)

But if you want to do more testing with a bigger database, be my guest

Tim Hunt
added a comment - 22/Mar/11 7:55 PM Grrr! Why fail PULL-487 out-of-hand because it is the sister of PULL-486, which affects 2.0 where the code if different because of the enrol stuff?
And, if you actually bother to read what was written above, then you will see that I did test all the 1.9 queries on the OU's database. (700,000+ users, 100,000+ role assignments, 100,000+ contexts. 3,000+ courses - I think it is actually much more than that but I don't have the exact number right now.)
But if you want to do more testing with a bigger database, be my guest

I read this 100% (in fact I did that more than once), so I knew perfectly about your results above, so please, don't make false assumptions.

And they where failed because both are using the same (IN + subquery having DISTINCT) thing. I'm sure you tested it under PostgreSQL... but what happens with MySQL (yes, that tiny database that nobody uses).

So I think it's fair enough to dedicate some time for testing and comparing for one more week. The rest of the patch seemed perfect, and that was stated in the PULL request, so I really cannot see any problem with the decision.

Eloy Lafuente (stronk7)
added a comment - 22/Mar/11 9:25 PM Tim,
I read this 100% (in fact I did that more than once), so I knew perfectly about your results above, so please, don't make false assumptions.
And they where failed because both are using the same (IN + subquery having DISTINCT) thing. I'm sure you tested it under PostgreSQL... but what happens with MySQL (yes, that tiny database that nobody uses).
So I think it's fair enough to dedicate some time for testing and comparing for one more week. The rest of the patch seemed perfect, and that was stated in the PULL request, so I really cannot see any problem with the decision.
Ciao

Andrew Davis
added a comment - 24/Mar/11 5:01 PM Ok, in this issue we've created queries like this...
SELECT u.* $ofields
FROM {$CFG->prefix}user u
INNER JOIN {$CFG->prefix}role_assignments ra ON u.id = ra.userid
$groupsql
WHERE ra.roleid $gradebookroles
AND ra.contextid $relatedcontexts
WHERE u.id IN (
SELECT DISTINCT ra.userid
FROM {$CFG->prefix}role_assignments ra
WHERE ra.roleid $gradebookroles
AND ra.contextid $relatedcontexts
)
AND u.deleted = 0
$groupwheresql
ORDER BY $order
From what I can gather from the comments here and in the 2 pull requests Eloy and Petr are suggesting we try something like this...
SELECT u.* $ofields
FROM {$CFG->prefix}user u
INNER JOIN {$CFG->prefix}role_assignments ra ON u.id = ra.userid
$groupsql
WHERE ra.roleid $gradebookroles
AND ra.contextid $relatedcontexts
JOIN (
SELECT DISTINCT ra.userid
FROM {$CFG->prefix}role_assignments ra
WHERE ra.roleid $gradebookroles
AND ra.contextid $relatedcontexts
) uinner ON u.id = uinner.userid
WHERE u.deleted = 0
$groupwheresql
ORDER BY $order
Which is actually a variation we didnt try. I think... Anyhow, Tim, would you be able to test that variation against your dataset?

Andrew, I don't understand your comment. I don't think any of our queries have both a INNER JOIN {$CFG->prefix}role_assignments ... and a WHERE u.id IN (... Is that just a copy/paste error?

I thought Eloy was just suggesting we try removing the DISTINCT from WHERE u.id IN (SELECT DISTINCT ...

The switch from IN to JOINing a view is an idea that might be worth trying. My guess (and experience) would be that in cases where we have another way to limit the list of users (either $groupsql, or a join no grade_grades, the JOIN will perform worse, because the DB will probably evaluate the inner query for the whole role_assignments table, whereas we only need to think about a small subset of users. In the particular query you show, with no other restriction on userid, the JOIN may be slightly faster, or about the same speed, as the query I wrote.

Anyway, the only way to know is to test on a large example database on both Postgres and MySQL (and the other DBs if you can be bothered.)

Tim Hunt
added a comment - 24/Mar/11 5:47 PM Andrew, I don't understand your comment. I don't think any of our queries have both a INNER JOIN {$CFG->prefix}role_assignments ... and a WHERE u.id IN (... Is that just a copy/paste error?
I thought Eloy was just suggesting we try removing the DISTINCT from WHERE u.id IN (SELECT DISTINCT ...
The switch from IN to JOINing a view is an idea that might be worth trying. My guess (and experience) would be that in cases where we have another way to limit the list of users (either $groupsql, or a join no grade_grades, the JOIN will perform worse, because the DB will probably evaluate the inner query for the whole role_assignments table, whereas we only need to think about a small subset of users. In the particular query you show, with no other restriction on userid, the JOIN may be slightly faster, or about the same speed, as the query I wrote.
Anyway, the only way to know is to test on a large example database on both Postgres and MySQL (and the other DBs if you can be bothered.)

http://dev.mysql.com/doc/refman/5.1/en/subquery-restrictions.html describes some mysql problems with IN (SELECT ...), we have already tested the JOIN (SELECT DISTINCT ...) approach in enrol related queries. I think we were bitten by this a few times before and after a few nights in stats hacking I am not using it any more - for uncorrelated subqueries the JOIN is better, EXISTS works fine for the rest. Please note 1.9.x supports older versions of mysql which may affect the results too.

Petr Skoda
added a comment - 24/Mar/11 9:40 PM http://dev.mysql.com/doc/refman/5.1/en/subquery-restrictions.html describes some mysql problems with IN (SELECT ...), we have already tested the JOIN (SELECT DISTINCT ...) approach in enrol related queries. I think we were bitten by this a few times before and after a few nights in stats hacking I am not using it any more - for uncorrelated subqueries the JOIN is better, EXISTS works fine for the rest. Please note 1.9.x supports older versions of mysql which may affect the results too.

Andrew Davis
added a comment - 25/Mar/11 11:19 AM Tim, apparently I cant reliably copy and paste. Sorry for the confusion.
So I believe we have 2 alternatives that need to be compared.
2) WHERE u.id IN (SELECT ra.userid... with no distinct in the inner query
3) JOIN (SELECT DISTINCT ra.userid...
) rainner ON u.id = rainner.userid
Tim's testing above found that EXISTS performed terribly.

I'm using mysql and Moodle 2.0. I used /admin/generate.php to create a whole bunch of users, course, role assignments etc. I'm just using Moodle's own profiling to get some numbers. The numbers Tim gets will be more valid as its based on real data.

First, I removed the distinct in the inner query. Execution time to view the grade report dropped 0.3%. Memory usage went up 0.1%. No effect essentially.

Now going to try using a JOIN instead....

Switching from IN to JOIN seems to make the grader report execution time 5% slower. Kind of weird but I suspect thats within the "random variation" range so it probably doesn't indicate that switching would be a bad idea.

Andrew Davis
added a comment - 25/Mar/11 3:07 PM - edited I'm using mysql and Moodle 2.0. I used /admin/generate.php to create a whole bunch of users, course, role assignments etc. I'm just using Moodle's own profiling to get some numbers. The numbers Tim gets will be more valid as its based on real data.
First, I removed the distinct in the inner query. Execution time to view the grade report dropped 0.3%. Memory usage went up 0.1%. No effect essentially.
Now going to try using a JOIN instead....
Switching from IN to JOIN seems to make the grader report execution time 5% slower. Kind of weird but I suspect thats within the "random variation" range so it probably doesn't indicate that switching would be a bad idea.

Andrew Davis
added a comment - 25/Mar/11 3:52 PM - edited Ive pushed the 2.0 version using JOIN to github so I can access from home. https://github.com/andyjdavis/moodle/compare/master...MDL-20946_roles_averages

What you should do is, having created the huge test database, then run the queries directly from the command-line (or phpMyAdmin, and the equivalent for other DBs). That lets you time just the query with minimal noise.

You should also use Explain to see how the database is processing the query.

Tim Hunt
added a comment - 25/Mar/11 4:01 PM What you should do is, having created the huge test database, then run the queries directly from the command-line (or phpMyAdmin, and the equivalent for other DBs). That lets you time just the query with minimal noise.
You should also use Explain to see how the database is processing the query.

Running both the JOIN and the IN versions of the queries directly in mysql query browser I get the following...

Using JOIN: average time 0.0480 seconds
Using IN: average time 0.0469 seconds

Thats running each one half a dozen times and averaging the results. Again, switching from IN to JOIN actually makes the query marginally slower. The difference is so small I dont think it really matters either way. If Petr says there are potential problems with using the IN version then Im happy to go with the JOIN version.

Tim, are you happy to repeat this testing in the 1.9 version and update the branch in your repository? (depending on whether or not your testing differs wildly from mine).

Andrew Davis
added a comment - 29/Mar/11 11:32 AM - edited Running both the JOIN and the IN versions of the queries directly in mysql query browser I get the following...
Using JOIN: average time 0.0480 seconds
Using IN: average time 0.0469 seconds
Thats running each one half a dozen times and averaging the results. Again, switching from IN to JOIN actually makes the query marginally slower. The difference is so small I dont think it really matters either way. If Petr says there are potential problems with using the IN version then Im happy to go with the JOIN version.
Tim, are you happy to repeat this testing in the 1.9 version and update the branch in your repository? (depending on whether or not your testing differs wildly from mine).

No, I am not happy to re-test. I have got lots of other things to try to get done by the end of this week.

I have had previous experiences where a JOIN query like this was a performance disaster, and an IN query was better. There are potential problems with either way of writing the query, depending on the whims of the databases' query optimisers. The only reliable option is to test the performance, which we have now done.

We have some code (thin IN version) that performs well on our two main database, and has been tested to make sure it is functionally correct. It think we should integrate that.

If anyone else wants to do more work on this, then be my guest, but if we agonise this much about every change we make, then Moodle development will grind to a halt.

Tim Hunt
added a comment - 29/Mar/11 2:42 PM No, I am not happy to re-test. I have got lots of other things to try to get done by the end of this week.
I have had previous experiences where a JOIN query like this was a performance disaster, and an IN query was better. There are potential problems with either way of writing the query, depending on the whims of the databases' query optimisers. The only reliable option is to test the performance, which we have now done.
We have some code (thin IN version) that performs well on our two main database, and has been tested to make sure it is functionally correct. It think we should integrate that.
If anyone else wants to do more work on this, then be my guest, but if we agonise this much about every change we make, then Moodle development will grind to a halt.

Sorry, that was a bit of a rant. The key point here is that we are not worrying about one query being 5% or 10% slower than another - that will always tend to depend on the exact statistics of the data in the various tables, and the way the query optimiser works in a particular version of a particular database.

The thing you do have to worry about is if a particular database query triggers a really pathological case in the query optimiser of some database. When that happens, the query can take 1000 times longer than necessary (e.g. minutes not miliseconds). That is what you have to worry about and avoid.

That is clearly not happening with either of the proposed queries, so we don't have to worry.

Tim Hunt
added a comment - 29/Mar/11 2:47 PM Sorry, that was a bit of a rant. The key point here is that we are not worrying about one query being 5% or 10% slower than another - that will always tend to depend on the exact statistics of the data in the various tables, and the way the query optimiser works in a particular version of a particular database.
The thing you do have to worry about is if a particular database query triggers a really pathological case in the query optimiser of some database. When that happens, the query can take 1000 times longer than necessary (e.g. minutes not miliseconds). That is what you have to worry about and avoid.
That is clearly not happening with either of the proposed queries, so we don't have to worry.

re the rant, don't worry about it. Our current process is prone to being rage inducing

Unfortunately, with you saying to use IN and Petr saying to use JOIN I don't feel that I am able to make this decision. Ultimately you and I can make whatever decision we like but if Petr doesn't agree then the fix will simply be rejected. I'll try and get Petr back to comment on this issue before I bother creating pull requests. I'm trying to avoid another MDL-25718 (5 rejected pull requests so far).

Andrew Davis
added a comment - 29/Mar/11 3:05 PM - edited re the rant, don't worry about it. Our current process is prone to being rage inducing
Unfortunately, with you saying to use IN and Petr saying to use JOIN I don't feel that I am able to make this decision. Ultimately you and I can make whatever decision we like but if Petr doesn't agree then the fix will simply be rejected. I'll try and get Petr back to comment on this issue before I bother creating pull requests. I'm trying to avoid another MDL-25718 (5 rejected pull requests so far).

My experience was that IN can be a performance disaster for mysql, until somebody actually proves it is good or wrong it should not hit any STABLE branch. At the moment it works only if students have one role, but that is probably majority of installs, we can introduce any major database change without testing because it may create serious performance problems for existing sites that use mysql, sorry.

To Tim: we are already using the JOIN method for enrolment queries and it seems to work fine so far, we did not test IN yet (if yes please tell me the file name and line number).

Andrew: your testing of IN x JOIN in 2.0 is not correct because the outer query is already limited by the enrol sql, the results in 1.9 may be different because the outer sql returns more data. You would have to change the enrol SQL to use IN too to get the results you wanted. Ideally the test should be done for a really large outer query (all users) and small number of inner query (enrolled users). You might want to create a new query just for that.

Petr Skoda
added a comment - 29/Mar/11 3:50 PM My experience was that IN can be a performance disaster for mysql, until somebody actually proves it is good or wrong it should not hit any STABLE branch. At the moment it works only if students have one role, but that is probably majority of installs, we can introduce any major database change without testing because it may create serious performance problems for existing sites that use mysql, sorry.
To Tim: we are already using the JOIN method for enrolment queries and it seems to work fine so far, we did not test IN yet (if yes please tell me the file name and line number).
Andrew: your testing of IN x JOIN in 2.0 is not correct because the outer query is already limited by the enrol sql, the results in 1.9 may be different because the outer sql returns more data. You would have to change the enrol SQL to use IN too to get the results you wanted. Ideally the test should be done for a really large outer query (all users) and small number of inner query (enrolled users). You might want to create a new query just for that.

Petr Skoda
added a comment - 29/Mar/11 3:53 PM In any case the mysql manual clearly says that "IN" has performance problems, that is enough for reject if you do not prove the docs do not apply to this case:
A typical case for poor IN subquery performance is when the subquery returns a small number of rows but the outer query returns a large number of rows to be compared to the subquery result.
The problem is that, for a statement that uses an IN subquery, the optimizer rewrites it as a correlated subquery. Consider the following statement that uses an uncorrelated subquery: ...

3. Sorry, I don't have time to do any more work on this. We have already deployed my fix to our live servers, because we needed it, and it works fine on Postgres. I guess we will go on using that until we move to 1.9.12.

Tim Hunt
added a comment - 29/Mar/11 4:49 PM So, following a discussion with Petr in dev chat:
1. The MySQL docs are http://dev.mysql.com/doc/refman/5.1/en/subquery-restrictions.html (still applies to 5.5 too.)
2. The correct solution is to rewrite the queries to be like the second query in this comment: http://tracker.moodle.org/browse/MDL-20946?focusedCommentId=107430&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-107430
3. Sorry, I don't have time to do any more work on this. We have already deployed my fix to our live servers, because we needed it, and it works fine on Postgres. I guess we will go on using that until we move to 1.9.12.
Chat log: http://moodle.org/mod/cvsadmin/view.php?conversationid=7100#c264809