We should now be ready for ref_id to be made the primary key. Please do check the results of those queries again (the two numbers should be the same) before making the change, because I'm not completely confident that duplicates won't sneak back in. That won't be a concern after https://gerrit.wikimedia.org/r/259444 .
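
For reference, the "two numbers should be the same" check can be sketched as below. This is only an illustration against an in-memory SQLite database: the table name `example_ref`, the `payload` column, and the helper `counts_match` are hypothetical stand-ins, not the real schema or the actual queries referred to above.

```python
import sqlite3


def counts_match(conn, table, column):
    """Return (is_unique, total_rows, distinct_values) for the given column.

    The column is safe to promote to a primary key only when the two
    counts are equal. Table and column names are interpolated directly,
    so pass trusted identifiers only.
    """
    total, distinct = conn.execute(
        f"SELECT COUNT(*), COUNT(DISTINCT {column}) FROM {table}"
    ).fetchone()
    return total == distinct, total, distinct


# Hypothetical table standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example_ref (ref_id TEXT NOT NULL, payload TEXT)")
conn.executemany(
    "INSERT INTO example_ref VALUES (?, ?)",
    [("a", "x"), ("b", "y"), ("b", "z")],  # "b" appears twice
)

# With a duplicate present, the counts differ.
dup_check = counts_match(conn, "example_ref", "ref_id")

# After removing the duplicate row, the counts agree.
conn.execute("DELETE FROM example_ref WHERE payload = 'z'")
clean_check = counts_match(conn, "example_ref", "ref_id")
```

If the first value returned is false, duplicates have crept back in and the primary key change should wait.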

Thanks for taking care of this. As you can see, it has already brought some improvements by ensuring that there are no duplicate values. It will also make schema changes easier from now on, thanks to the presence of a primary key.